to inhibit M. tuberculosis infection. Furthermore, it is known that IFN-γ stimulates 
human macrophages to make 1,25-dihydroxy-vitamin D3. Similarly, IL-12 has been 
shown to play a role in stimulating resistance to M. tuberculosis infection. For a review 
of the immunology of M. tuberculosis infection see Chan and Kaufmann, in 
Tuberculosis: Pathogenesis, Protection and Control, Bloom (ed.), ASM Press, 
Washington, DC, 1994. 

Accordingly, there is a need in the art for improved diagnostic methods 
for detecting tuberculosis. The present invention fulfills this need and further provides 
other related advantages. 

SUMMARY OF THE INVENTION 

Briefly stated, the present invention provides compositions and methods 
for diagnosing tuberculosis. In one aspect, polypeptides are provided comprising an 
antigenic portion of a soluble M. tuberculosis antigen, or a variant of such an antigen 
that differs only in conservative substitutions and/or modifications. In one embodiment 
of this aspect, the soluble antigen has one of the following N-terminal sequences: 

(a) Asp-Pro-Val-Asp-Ala-Val-Ile-Asn-Thr-Thr-Cys-Asn-Tyr-Gly- 

Gln-Val-Val-Ala-Ala-Leu (SEQ ID NO: 115); 
<Ai Aia-Yai-Glu-Ser-i.}iv-MetT.eu^ 

Ser (SEQ ID NO: ! 16), 
(c) Ala-Ala-Met-Lys-Pro-Arg-Thr-Gly-Asp-Gly-Pro-Leu-Glu-Ala- 

Ala-Lys-Glu-Gly-Arg (SEQ ID NO: 117); 

-SEO ID NO: ; IS), 
(e) Asp-Ile-Gly-Ser-Glu-Ser-Thr-Glu-Asp-Gln-Gln-Xaa-Ala-Val 

(SEQ ID NO: 119); 
: \;j-Giu-Giu-Ser-Ite Ser- fhr Xaa-Giu Xaa lie 'SEO iD 
(h) Ala-Pro-Lys-Thr-Tyr-Xaa-GI^^ 
Gly (SEQ ID NO: 122); 

(i) Asp-Pro-Ala-Ser-Ala-Pro-Asp-Val-Pro-Thr-Ala-Ala-Gln-Leu- 

Thr-Ser-Leu-Leu-Asn-Ser-Leu-Ala-Asp-Pro-Asn-Val-Ser-Phe- 
Ala-Asn (SEQ ID NO: 123); 

(j) Xaa-Asp-Ser-Glu-Lys-Ser-Ala-Thr-Ile-Lys-Val-Thr-Asp-Ala^ 

Ser; (SEQ ID NO: 129) 

(k) Ala-Gly-Asp-Thr-Xaa-Ile-T^ 

Asp; (SEQ ID NO: 130) or 

(l) Ala-Pro-Glu-Ser-Gly-Ala-Gly-Leu-Gly-Gly-Thr-Val-Gln-Ala- 

Gly; (SEQ ID NO: 131) 



wherein Xaa may be any amino acid. 

In a related aspect, polypeptides are provided comprising an 
immunogenic portion of an M. tuberculosis antigen, or a variant of such an antigen that 
differs only in conservative substitutions and/or modifications, the antigen having one 
of the following N-terminal sequences: 

<rm Xaa-T>T-nevAia-Ty^ 

He-Asn-Vai-Kis-Leu Vai; .;SEQ ID \Q \21\ or 
(n) Asp-Pro-Pro-Asp-Pro-His-Gln-Xaa-Asn-Met-Thr-Lys-Gly-Tyr- 
Tyr-Pro-Gly-Gly-Arg-Arg-Xaa-Phe; (SEQ ID NO: 124) 
wherein Xaa may be any amino acid. 

In another embodiment, the soluble M. tuberculosis antigen comprises an 
amino acid sequence encoded by a DNA sequence selected from the group consisting of 
:he sequences rccit ^ ^ SEO ID N()S:1. 1. a-H). . u >in d %. the 

complements or said sequences, and DN A sequences that hybridize to a sequence 

recited ::i SEO ID NOS 1 " d-'n - >v ^ . ; , jt , ■ 0 

and kJ 'i a commemen; t.nereo: under 

moderately stringent conditions. 
substitutions and/or modifications, wherein the antigen comprises an amino acid 
sequence encoded by a DNA sequence selected from the group consisting of the 
sequences recited in SEQ ID NOS: 26-51, 133, 134, 158-178, 184-188, 194-196, 198, 
210-220, 232, 234, 235, 237-242, 248-251, 256-271, 287, 288, 290-293 and 298-337, 
the complements of said sequences, and DNA sequences that hybridize to a sequence 
recited in SEQ ID NOS: 26-51, 133, 134, 158-178, 184-188, 194-196, 198, 210-220, 
232, 234, 235, 237-242, 248-251, 256-271, 287, 288, 290-293 and 298-337, or a 
complement thereof under moderately stringent conditions. 

In related aspects, DNA sequences encoding the above polypeptides, 
recombinant expression vectors comprising these DNA sequences and host cells 
transformed or transfected with such expression vectors are also provided. 

In another aspect, the present invention provides fusion proteins 
comprising a first and a second inventive polypeptide or, alternatively, an inventive 
polypeptide and a known M. tuberculosis antigen. 

In further aspects of the subject invention, methods and diagnostic kits 
are provided for detecting tuberculosis in a patient. The methods comprise: 
(a) contacting a biological sample with at least one of the above polypeptides; and 
(b) detecting in the sample the presence of antibodies that bind to the polypeptide or 
polypeptides, thereby detecting M. tuberculosis infection in the biological sample. 
Suitable biological samples include whole blood, sputum, serum, plasma, saliva, 
cerebrospinal fluid and urine. The diagnostic kits comprise one or more of the above 
polypeptides in combination with a detection reagent. 

The present invention also provides methods for detecting 
M. tuberculosis infection comprising: (a) obtaining a biological sample from a patient; 
(b) contacting the sample with at least one oligonucleotide primer in a polymerase 
chain reaction, the oligonucleotide primer being specific for a DNA sequence encoding 
the above polypeptides; and (c) detecting in the sample a DNA sequence that amplifies 
in the presence of the first and second oligonucleotide primers. In one embodiment, the 
In a further aspect, the present invention provides a method for detecting 
M. tuberculosis infection in a patient comprising: (a) obtaining a biological sample 
from the patient; (b) contacting the sample with an oligonucleotide probe specific for a 
DNA sequence encoding the above polypeptides; and (c) detecting in the sample a DNA 
sequence that hybridizes to the oligonucleotide probe. In one embodiment, the 
oligonucleotide probe comprises at least about 15 contiguous nucleotides of such a 
DNA sequence. 
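
The probe-length criterion above (at least about 15 contiguous nucleotides of the encoding DNA sequence) can be illustrated with a short sketch. This is only an illustration under stated assumptions: the function names are hypothetical, sequences are plain A/C/G/T strings, and an exact string match stands in for hybridization, which in practice also depends on temperature, salt and mismatch tolerance.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    pairs = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def longest_shared_run(probe: str, target: str) -> int:
    """Length of the longest stretch of contiguous nucleotides common to both."""
    probe, target = probe.upper(), target.upper()
    best = 0
    # prev[j] holds the length of the common run ending at probe[i-2], target[j-1]
    prev = [0] * (len(target) + 1)
    for i in range(1, len(probe) + 1):
        curr = [0] * (len(target) + 1)
        for j in range(1, len(target) + 1):
            if probe[i - 1] == target[j - 1]:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def probe_meets_length_criterion(probe: str, target: str, min_run: int = 15) -> bool:
    """True if the probe shares at least min_run contiguous nucleotides
    with the target strand or with its reverse complement."""
    run = max(
        longest_shared_run(probe, target),
        longest_shared_run(probe, reverse_complement(target)),
    )
    return run >= min_run
```

The default of 15 mirrors the "at least about 15 contiguous nucleotides" wording; the word "about" is not modelled, so a caller who wants a looser bound would pass a different `min_run`.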

In yet another aspect, the present invention provides antibodies, both 
polyclonal and monoclonal, that bind to the polypeptides described above, as well as 
methods for their use in the detection of M. tuberculosis infection. 

These and other aspects of the present invention will become apparent 
upon reference to the following detailed description and attached drawings. All 
references disclosed herein are hereby incorporated by reference in their entirety as if 
each was incorporated individually. 

BRIEF DESCRIPTION OF THE DRAWINGS AND SEQUENCE IDENTIFIERS 

Figure 1A and B illustrate the stimulation of proliferation and interferon-γ 
production in T cells derived from a first and a second M. tuberculosis-immune donor, 
respectively, by the 14 Kd, 20 Kd and 1() Kd antigens described in Example 1. 

Figures 1A-D illustrate the reactivity of antisera raised against secretory 
M. tuberculosis proteins, the known M. tuberculosis antigen 85b and the inventive 
antigens Tb38-1 and TbH-9, respectively, with M. tuberculosis lysate (lane 2), M. 
tuberculosis secretory proteins (lane 3), recombinant Tb38-1 (lane 4), recombinant 
TbH-9 (lane 5) and recombinant 85b (lane 6). 

Figure 3A illustrates the stimulation of proliferation in a TbH-9-specific 
T cell clone by secretory M. tuberculosis proteins, recombinant TbH-9 and a control 
antigen, TbRa11. 

Figure 3B illustrates the stimulation of interferon-γ production in a TbH-9- 
Figure 4 illustrates the reactivity of two representative polypeptides with 
sera from M. tuberculosis-infected and uninfected individuals, as compared to the 
reactivity of bacterial lysate. 

Figure 5 shows the reactivity of four representative polypeptides with 
sera from M. tuberculosis-infected and uninfected individuals, as compared to the 
reactivity of the 38 kD antigen. 

Figure 6 shows the reactivity of recombinant 38 kD and TbRa11 
antigens with sera from M. tuberculosis patients, PPD positive donors and normal 
donors. 

Figure 7 shows the reactivity of the antigen TbRa2A with 38 kD 
negative sera. 

Figure 8 shows the reactivity of the antigen of SEQ ID NO: 60 with sera 
from M. tuberculosis patients and normal donors. 

Figure 9 illustrates the reactivity of the recombinant antigen TbH-29 
(SEQ ID NO: 137) with sera from M. tuberculosis patients, PPD positive donors and 
normal donors as determined by indirect ELISA. 

Figure 10 illustrates the reactivity of the recombinant antigen TbH-33 
(SEQ ID NO: 140) with sera from M. tuberculosis patients and from normal donors, and 
with a pool of sera from M. tuberculosis patients, as determined both by direct and 
indirect ELISA. 

Figure 11 illustrates the reactivity of increasing concentrations of the 
recombinant antigen TbH-33 (SEQ ID NO: 140) with sera from M. tuberculosis patients 
and from normal donors as determined by ELISA. 

figures 3 A - : - :liustrate :iie :e;u:ti vity o f T hc : ; i ;^n:f)inan: antigens MP- 
F MO-3. MO-4. MO-ZS and MO-3<F respectively, with sera from M tuberculosis 
patients and from normal donors as determined by ELISA. 
SEQ. ID NO. 4 is the DNA sequence of TbRa12. 
SEQ. ID NO. 5 is the DNA sequence of TbRa13. 
SEQ. ID NO. 6 is the DNA sequence of TbRa16. 
SEQ. ID NO. 7 is the DNA sequence of TbRa17. 
SEQ. ID NO. 8 is the DNA sequence of TbRa18. 
SEQ. ID NO. 9 is the DNA sequence of TbRa19. 
SEQ. ID NO. 10 is the DNA sequence of TbRa24. 
SEQ. ID NO. 11 is the DNA sequence of TbRa26. 
SEQ. ID NO. 12 is the DNA sequence of TbRa28. 
SEQ. ID NO. 13 is the DNA sequence of TbRa29. 
SEQ. ID NO. 14 is the DNA sequence of TbRa2A. 
SEQ. ID NO. 15 is the DNA sequence of TbRa3. 
SEQ. ID NO. 16 is the DNA sequence of TbRa32. 
SEQ. ID NO. 17 is the DNA sequence of TbRa35. 
SEQ. ID NO. 18 is the DNA sequence of TbRa36. 
SEQ. ID NO. 19 is the DNA sequence of TbRa4. 
SEQ. ID NO. 20 is the DNA sequence of TbRa9. 
SEQ. ID NO. 21 is the DNA sequence of TbRaB. 
SEQ. ID NO. 22 is the DNA sequence of TbRaC. 
SEQ. ID NO. 23 is the DNA sequence of TbRaD. 
SEQ. ID NO. 24 is the DNA sequence of YYWCPG. 
SEQ. ID NO. 25 is the DNA sequence of AAMK. 
SEO ID NO 2o .s the DNA ..ecueive o! I'bl Of 
SEQ. ID NO. 27 is the DNA sequence of TbL-24. 
SEQ. ID NO. 28 is the DNA sequence of TbL-25. 
SEQ. ID NO. 29 is the DNA sequence of TbL-28. 
SEQ ;D\'.' - : is tiie DNA sequence of Tbi. -2^ 
SEQ T) N( < ^' :s 'lie DN \ sequence T ^P.^ 
SEQ. ID NO. 34 is the DNA sequence of TbM-1. 
SEQ. ID NO. 35 is the DNA sequence of TbM-3. 
SEQ. ID NO. 36 is the DNA sequence of TbM-6. 
SEQ. ID NO. 37 is the DNA sequence of TbM-7. 
SEQ. ID NO. 38 is the DNA sequence of TbM-9. 
SEQ. ID NO. 39 is the DNA sequence of TbM-12. 
SEQ. ID NO. 40 is the DNA sequence of TbM-13. 
SEQ. ID NO. 41 is the DNA sequence of TbM-14. 
SEQ. ID NO. 42 is the DNA sequence of TbM-15. 

SEQ. ID NO. 43 is the DNA sequence of TpfM. * 

SEQ. ID NO. 44 is the DNA sequence of TbH-4-FWD. 
SEQ. ID NO. 45 is the DNA sequence of TbH-12. 
SEQ. ID NO. 46 is the DNA sequence of Tb38-1. 
SEQ. ID NO. 47 is the DNA sequence of Tb38-4. 
SEQ. ID NO. 48 is the DNA sequence of TbL-17. 
SEQ. ID NO. 49 is the DNA sequence of TbL-20. 

SEQ ID NO. -0 is The DNA sequence of TbL-2: 

SEQ. ID NO. M is the DNA sequence of TbH-lO. 

SEQ. ID NO. 52 is the DNA sequence of DPEP. 
SEQ. ID NO. 53 is the deduced amino acid sequence of DPEP. 
SEQ. ID NO. 54 is the protein sequence of DPV N-terminal Antigen. 
SEQ. ID NO. 55 is the protein sequence of AVGS N-terminal Antigen. 
SEQ. ID NO. 56 is the protein sequence of AAMK N-terminal Antigen. 
SEQ. ID NO. 57 is the protein sequence of YYWC N-terminal Antigen. 
SEQ. ID NO. 58 is the protein sequence of DIGS N-terminal Antigen. 
SEQ. ID NO. 59 is the protein sequence of AEES N-terminal Antigen. 
SEQ. ID NO. 60 is the protein sequence of DPEP N-terminal Antigen. 
SEQ. ID NO. 61 is the protein sequence of APKT N-terminal Antigen. 
SEQ. ID NO. 64 is the deduced amino acid sequence of TbRa1. 
SEQ. ID NO. 65 is the deduced amino acid sequence of TbRa10. 
SEQ. ID NO. 66 is the deduced amino acid sequence of TbRa11. 
SEQ. ID NO. 67 is the deduced amino acid sequence of TbRa12. 
SEQ. ID NO. 68 is the deduced amino acid sequence of TbRa13. 
SEQ. ID NO. 69 is the deduced amino acid sequence of TbRa16. 
SEQ. ID NO. 70 is the deduced amino acid sequence of TbRa17. 
SEQ. ID NO. 71 is the deduced amino acid sequence of TbRa18. 
SEQ. ID NO. 72 is the deduced amino acid sequence of TbRa19. 
SEQ. ID NO. 73 is the deduced amino acid sequence of TbRa24. 
SEQ. ID NO. 74 is the deduced amino acid sequence of TbRa26. 
SEQ. ID NO. 75 is the deduced amino acid sequence of TbRa28. 
SEQ. ID NO. 76 is the deduced amino acid sequence of TbRa29. 
SEQ. ID NO. 77 is the deduced amino acid sequence of TbRa2A. 
SEQ. ID NO. 78 is the deduced amino acid sequence of TbRa3. 
SEQ. ID NO. 79 is the deduced amino acid sequence of TbRa32. 
SEQ. ID NO. 80 is the deduced amino acid sequence of TbRa35. 
SEQ. ID NO. 81 is the deduced amino acid sequence of TbRa36. 
SEQ. ID NO. 82 is the deduced amino acid sequence of TbRa4. 
SEQ. ID NO. 83 is the deduced amino acid sequence of TbRa9. 
SEQ. ID NO. 84 is the deduced amino acid sequence of TbRaB. 



SEQ. ID NO. 85 is the deduced amino acid sequence of TbRaC. 
SEQ. ID NO. 86 is the deduced amino acid sequence of TbRaD. 
SEQ. ID NO. 87 is the deduced amino acid sequence of YYWCPG. 
SEQ. ID NO. 88 is the deduced amino acid sequence of TbAAMK. 
SEQ. ID NO. 89 is the deduced amino acid sequence of Tb38-1. 
SEQ. ID NO. 90 is the deduced amino acid sequence of TbH-4. 

^ ^ y s me deduced arm:-. kuo sequence , 
SEQ. ID NO. 94 is the DNA sequence of DPAS. 
SEQ. ID NO. 95 is the deduced amino acid sequence of DPAS. 
SEQ. ID NO. 96 is the DNA sequence of DPV. 
SEQ. ID NO. 97 is the deduced amino acid sequence of DPV. 
SEQ. ID NO. 98 is the DNA sequence of ESAT-6. 
SEQ. ID NO. 99 is the deduced amino acid sequence of ESAT-6. 
SEQ. ID NO. 100 is the DNA sequence of TbH-8-2. 
SEQ. ID NO. 101 is the DNA sequence of TbH-9FL. 
SEQ. ID NO. 102 is the deduced amino acid sequence of TbH-9FL. 
SEQ. ID NO. 103 is the DNA sequence of TbH-9-1. 
SEQ. ID NO. 104 is the deduced amino acid sequence of TbH-9-1. 
SEQ. ID NO. 105 is the DNA sequence of TbH-9-4. 
SEQ. ID NO. 106 is the deduced amino acid sequence of TbH-9-4. 
SEQ. ID NO. 107 is the DNA sequence of Tb38-1F2 IN. 
SEQ. ID NO. 108 is the DNA sequence of Tb38-1F2 RP. 

SEQ. ID NT) :r>9 1S the deduced ammo acid sequence of Tb3"L. 

SEQ. ID NO. 110 is the deduced amino acid sequence of Tb38-IN. 
SEQ. ID NO. 111 is the DNA sequence of Tb38-1F3. 
SEQ. ID NO. 112 is the deduced amino acid sequence of Tb38-1F3. 
SEQ. ID NO. 113 is the DNA sequence of Tb38-1F5. 
SEQ. ID NO. 114 is the DNA sequence of Tb38-1F6. 

SEQ. ID NO. 115 is the deduced N-terminal amino acid sequence of DPV. 
SEQ. ID NO. 116 is the deduced N-terminal amino acid sequence of AVGS. 
SEQ. ID NO. 117 is the deduced N-terminal amino acid sequence of AAMK. 
SEQ. ID NO. 118 is the deduced N-terminal amino acid sequence of YYWC. 
SEQ. ID NO. 119 is the deduced N-terminal amino acid sequence of DIGS. 
SEQ. ID NO. 120 is the deduced N-terminal amino acid sequence of AAES. 
- x . — '.mind! i:n:"c v-* jm.^m , r nPto 



SEQ ID NO. 124 is the protein sequence of DPPD N-terminal Antigen. 
SEQ ID NO. 125-128 are the protein sequences of four DPPD cyanogen 
bromide fragments. 
SEQ ID NO. 129 is the N-terminal protein sequence of XDS antigen. 
SEQ ID NO. 130 is the N-terminal protein sequence of AGD antigen. 
SEQ ID NO. 131 is the N-terminal protein sequence of APE antigen. 
SEQ ID NO. 132 is the N-terminal protein sequence of XYI antigen. 
SEQ ID NO. 133 is the DNA sequence of TbH-29. 
SEQ ID NO. 135 is the DNA sequence of TbH-32. 
SEQ ID NO. 136 is the DNA sequence of TbH-33. 
SEQ ID NO. 137 is the predicted amino acid sequence of TbH-29. 
SEQ ID NO. 138 is the predicted amino acid sequence of TbH-30. 
SEQ ID NO. 139 is the predicted amino acid sequence of TbH-32. 
SEQ ID NO. 140 is the predicted amino acid sequence of TbH-33. 
SEQ ID NO: 141-146 are PCR primers used in the preparation of a fusion 
protein containing TbRa3, 38 kD and Tb38-1. 
SEQ ID NO: 147 is the DNA sequence of the fusion protein containing TbRa3, 
38 kD and Tb38-1. 
SEQ ID NO: 148 is the amino acid sequence of the fusion protein containing 
TbRa3, 38 kD and Tb38-1. 

SEQ ID NO: 149 is the DNA sequence of the M. tuberculosis antigen 38 kD. 
SEQ ID NO: 150 is the amino acid sequence of the M. tuberculosis antigen 38 kD. 



SEQ ID NO: 151 is the DNA sequence of XP14. 
SEQ ID NO: 152 is the DNA sequence of XP24. 
SEQ ID NO: 153 is the DNA sequence of XP31. 
SEQ ID NO: 157 is the predicted amino acid sequence encoded by the reverse 
complement of XP14. 
SEQ ID NO: 158 is the DNA sequence of XP27. 
SEQ ID NO: 159 is the DNA sequence of XP36. 
SEQ ID NO: 160 is the 5' DNA sequence of XP7. 
SEQ ID NO: 161 is the 5' DNA sequence of XP5. 
SEQ ID NO: 162 is the 5' DNA sequence of XP17. 
SEQ ID NO: 163 is the 5' DNA sequence of XP30. 
SEQ ID NO: 164 is the 5' DNA sequence of XP2. 
SEQ ID NO: 165 is the 3' DNA sequence of XP2. 
SEQ ID NO: 166 is the 5' DNA sequence of XP3. 
SEQ ID NO: 167 is the 3' DNA sequence of XP3. 
SEQ ID NO: 168 is the 5' DNA sequence of XP6. 
SEQ ID NO: 169 is the 3' DNA sequence of XP6. 
SEQ ID NO: 170 is the 5' DNA sequence of XP18. 
SEQ ID NO: 171 is the 3' DNA sequence of XP18. 
SEQ ID NO: 172 is the 5' DNA sequence of XPP. 
SEQ ID NO: 173 is the 3' DNA sequence of XPP. 
SEQ ID NO: 174 is the 5' DNA sequence of XP22. 
SEQ ID NO: 175 is the 3' DNA sequence of XP22. 
SEQ ID NO: 176 is the 5' DNA sequence of XP25. 
SEQ ID NO: 177 is the 3' DNA sequence of XP25. 
SEQ ID NO: 178 is the full-length DNA sequence of TbH4-XP1. 
SEQ ID NO: 179 is the predicted amino acid sequence of TbH4-XP1. 
SEQ ID NO: 180 is the predicted amino acid sequence encoded by the reverse 
complement of TbH4-XP1. 
SEQ ID NO: 181 is a first predicted amino acid sequence encoded by XP36. 
>cO aJ N( ) i 7 ss a secorn: predicted amino ac;d sequence encoded b\ XP7; 
SEQ ID NO: 184 is the DNA sequence of RDIF2. 
SEQ ID NO: 185 is the DNA sequence of RDIF5. 
SEQ ID NO: 186 is the DNA sequence of RDIF8. 
SEQ ID NO: 187 is the DNA sequence of RDIF10. 
SEQ ID NO: 188 is the DNA sequence of RDIF11. 
SEQ ID NO: 189 is the predicted amino acid sequence of RDIF2. 
SEQ ID NO: 190 is the predicted amino acid sequence of RDIF5. 
SEQ ID NO: 191 is the predicted amino acid sequence of RDIF8. 
SEQ ID NO: 192 is the predicted amino acid sequence of RDIF10. 
SEQ ID NO: 193 is the predicted amino acid sequence of RDIF11. 
SEQ ID NO: 194 is the 5' DNA sequence of RDIF12. 
SEQ ID NO: 195 is the 3' DNA sequence of RDIF12. 
SEQ ID NO: 196 is the DNA sequence of RDIF7. 
SEQ ID NO: 197 is the predicted amino acid sequence of RDIF7. 
SEQ ID NO: 198 is the DNA sequence of DIF2-1. 
SEQ ID NO: 199 is the predicted amino acid sequence of DIF2-1. 
SEQ ID NO: 200-207 are PCR primers used in the preparation of a fusion 
protein containing TbRa3, 38 kD, Tb38-1 and DPEP (hereinafter referred to as 
TbF-2). 
SEQ ID NO: 208 is the DNA sequence of the fusion protein TbF-2. 
SEQ ID NO: 209 is the amino acid sequence of the fusion protein TbF-2. 
SFO ID NO ::- ,::ie-DNA ;euuence of MO ; 



SEO ID N( v : 
SEU ID NO J 
SFO ID NO : 
SEQ ID NO- 3 
SEQ ID N* .)■ 2 
SFQ ID N( i : 



■ > 'he DNA .^uence for UO-" 

2 ;s :ne - * DNA sequence lor \1< 

3 .s ihe 5' E)NA sequence for MO-S. 
- :j :he : DNA sequence for MO-9 

s ~ -s :he ^ DNA .^.equence for MO-2o 
^ Tie DNA sequence for \1( >-2S 
SEQ ID NO: 219 is the 5' DNA sequence for MO-34. 
SEQ ID NO: 220 is the 5' DNA sequence for MO-35. 
SEQ ID NO: 221 is the predicted amino acid sequence for MO-1. 
SEQ ID NO: 222 is the predicted amino acid sequence for MO-2. 
SEQ ID NO: 223 is the predicted amino acid sequence for MO-4. 
SEQ ID NO: 224 is the predicted amino acid sequence for MO-8. 
SEQ ID NO: 225 is the predicted amino acid sequence for MO-9. 
SEQ ID NO: 226 is the predicted amino acid sequence for MO-26. 
SEQ ID NO: 227 is the predicted amino acid sequence for MO-28. 
SEQ ID NO: 228 is the predicted amino acid sequence for MO-29. 
SEQ ID NO: 229 is the predicted amino acid sequence for MO-30. 
SEQ ID NO: 230 is the predicted amino acid sequence for MO-34. 
SEQ ID NO: 231 is the predicted amino acid sequence for MO-35. 
SEQ ID NO: 232 is the determined DNA sequence for MO-10. 
SEQ ID NO: 233 is the predicted amino acid sequence for MO-10. 

SEQ ID NO: 234 is the 3' DNA sequence for MO-2^. 

SEQ ID NO: 235 is the full-length DNA sequence for DPPD 
SEQ ID NO: 236 is the predicted full-length amino acid sequence for DPPD 
SEQ ID NO: 237 is the determined 5' cDNA sequence for LSER-10 
SEQ ID NO: 238 is the determined 5' cDNA sequence for LSER-11 
SEQ ID NO: 239 is the determined 5' cDNA sequence for LSER-12 
SEQ ID NO: 240 is the determined 5' cDNA sequence for LSER-13 
SEQ ID NO: 241 is the determined 5' cDNA sequence for LSER-16 
SEQ ID NO: 242 is the determined 5' cDNA sequence for LSER-25 
SEQ ID NO: 243 is the predicted amino acid sequence for LSER-10 
SEQ ID NO: 244 is the predicted amino acid sequence for LSER-12 
SEQ ID NO: 245 is the predicted amino acid sequence for LSER-13 
SEQ ID NO: 246 is the predicted amino acid sequence for LSER-16 
SEQ ID NO: 249 is the determined cDNA sequence for LSER-23 
SEQ ID NO: 250 is the determined cDNA sequence for LSER-24 
SEQ ID NO: 251 is the determined cDNA sequence for LSER-27 
SEQ ID NO: 252 is the predicted amino acid sequence for LSER-18 
SEQ ID NO: 253 is the predicted amino acid sequence for LSER-23 
SEQ ID NO: 254 is the predicted amino acid sequence for LSER-24 
SEQ ID NO: 255 is the predicted amino acid sequence for LSER-27 
SEQ ID NO: 256 is the determined 5' cDNA sequence for LSER-1 
SEQ ID NO: 25"* is the determined 5 1 cDNA sequence for F.SF.R-" 
SEQ ID NO: 258 is the determined 5' .DNA sequence for LSER-- 
SEQ ID NO: 259 is the determined 5' cDNA sequence for LSER-5 
SEQ ID NO: 260 is the determined 5' cDNA sequence for LSER-6 
SEQ ID NO: 261 is the determined 5' cDNA sequence for LSER-8 
SEQ ID NO: 262 is the determined 5' cDNA sequence for LSER-14 
SEQ ID NO: 263 is the determined 5' cDNA sequence for LSER-15 
SEQ ID NO: 264 is the determined 5' cDNA sequence for LSER-17 
SEQ ID NO: 265 is the determined 5' cDNA sequence for LSER-19 
SEQ ID NO: 266 is the determined 5' cDNA sequence for LSER-20 
SEQ ID NO: 267 is the determined cDNA sequence for LSER-22 
SEQ ID NO: 268 is the determined cDNA sequence for LSER-26 
SEQ ID NO: 269 is the determined 5' cDNA sequence for LSER-28 
SEO ID N< ' 2~'-:s the determined c " seuuence tor : SEP-? 0 

SEQ IE' \'o 2^: :ne determined - " .:. seuuence :or LSER- ■* 
SEO IE* N* v 2"Z ::; the predicted rv.r.o .:cid ;cqucnce mr LSER-: 
SEQ ID \( ■ :s the predicted .iiiimo acid sequence lot LSER— 
SEQ ID NO- 2~d [ S :he predicted ammo acid sequence for LSER-?" 
SEQ EE) Nn 2 :s the predicted .mum ic;d seuuence :or LSER-^ 
SEQ ID NO: 279 is the predicted amino acid sequence for LSER-17 
SEQ ID NO: 280 is the predicted amino acid sequence for LSER-19 
SEQ ID NO: 281 is the predicted amino acid sequence for LSER-20 
SEQ ID NO: 282 is the predicted amino acid sequence for LSER-22 
SEQ ID NO: 283 is the predicted amino acid sequence for LSER-26 
SEQ ID NO: 284 is the predicted amino acid sequence for LSER-28 
SEQ ID NO: 285 is the predicted amino acid sequence for LSER-29 
SEQ ID NO: 286 is the predicted amino acid sequence for LSER-30 
SEQ ID NO: 287 is the determined cDNA sequence for LSER-9 
SEQ ID NO: 288 is the determined cDNA sequence for the reverse complement 
of LSER-6 
SEQ ID NO: 289 is the predicted amino acid sequence for the reverse 
complement of LSER-6 





SEQ ID NO: 290 is the determined 5' cDNA sequence for MO-12 
SEQ ID NO: 291 is the determined 5' cDNA sequence for MO-13 
SEQ ID NO: 292 is the determined 5' cDNA sequence for MO-19 
SEQ ID NO: 293 is the determined 5' cDNA sequence for MO-39 
SEQ ID NO: 294 is the predicted amino acid sequence for MO-12 
SEQ ID NO: 295 is the predicted amino acid sequence for MO-13 
SEQ ID NO: 296 is the predicted amino acid sequence for MO-19 
SEQ ID NO: 297 is the predicted amino acid sequence for MO-39 


SEQ ID NO: 298 is the determined 5' cDNA sequence for Erdsn-1 
SEQ ID NO: 299 is the determined 5' cDNA sequence for Erdsn-2 
SEQ ID NO: 300 is the determined 5' cDNA sequence for Erdsn-4 
SEQ ID NO: 301 is the determined 5' cDNA sequence for Erdsn-5 
SEQ ID NO: 302 is the determined 5' cDNA sequence for Erdsn-6 
SEQ ID NO: 303 is the determined 5' cDNA sequence for Erdsn-7 
SEQ ID NO: 304 is the determined 5' cDNA sequence for Erdsn-8 
SEQ ID NO: 307 is the determined 5' cDNA sequence for Erdsn-12 
SEQ ID NO: 308 is the determined 5' cDNA sequence for Erdsn-13 
SEQ ID NO: 309 is the determined 5' cDNA sequence for Erdsn-14 
SEQ ID NO: 310 is the determined 5' cDNA sequence for Erdsn-15 
SEQ ID NO: 311 is the determined 5' cDNA sequence for Erdsn-16 
SEQ ID NO: 312 is the determined 5' cDNA sequence for Erdsn-17 
SEQ ID NO: 313 is the determined 5' cDNA sequence for Erdsn-18 
SEQ ID NO: 314 is the determined 5' cDNA sequence for Erdsn-21 
— x ^ . . w . , _ ^ w^^iiiiin^u j ^ui\t\ sequence ior tirasn-_J 
SEQ ID NO: 316 is the determined 5' cDNA sequence for Erdsn-23 
SEQ ID NO: 317 is the determined 5' cDNA sequence for Erdsn-25 
SEQ ID NO: 318 is the determined 3' cDNA sequence for Erdsn-1 
SEQ ID NO: 319 is the determined 3' cDNA sequence for Erdsn-2 
SEQ ID NO: 320 is the determined 3' cDNA sequence for Erdsn-4 
SEQ ID NO: 321 is the determined 3' cDNA sequence for Erdsn-5 
SEQ ID NO: 322 is the determined 3' cDNA sequence for Erdsn-7 
SEQ ID NO: 323 is the determined 3' cDNA sequence for Erdsn-8 
SEO ID NO: 324 is the determined 3' cDNA seuuence lor Erdsn-J 
SEQ ID No. 32: :s the determined 3' cDNA seuuence tor Erdsn - i'i 
SEQ ID NO: 326 is the determined 3' cDNA sequence for Erdsn-12 
SEQ ID NO: 327 is the determined 3' cDNA sequence for Erdsn-13 

- ~ ^ -e^rmined . , i.V , .\ .seuuence :or .:\isr.- d 

SEO ID NO 30' :s ::ie determined " .; )\,\ ia jue::ce :o: ; - 

!D NT ' - ^'~rmmed 3' cDNA :c U ucncc lo: Erd- 
SEQ ID NO: 33 I :s the determined 3" cDNA seuuence tor Enlsii-! " 
SE ° [D NO: :s thciietemiined 3' cDNA seuuence for Erdsn- !S 
Sp ° :p NO - ^ ^^rmmed 3' ,DNA ,eauen,e :or Er,i:=n-:; 
SEQ ID NO: 337 is the determined cDNA sequence for Erdsn-24 
SEQ ID NO: 338 is the determined amino acid sequence for a M. tuberculosis 
85b precursor homolog 
SEQ ID NO: 339 is the determined amino acid sequence for spot 1 
SEQ ID NO: 340 is a determined amino acid sequence for spot 2 
SEQ ID NO: 341 is a determined amino acid sequence for spot 2 
SEQ ID NO: 342 is the determined amino acid sequence for spot 4 
SEQ ID NO: 343 is the sequence of primer PDM-157 
SEQ ID NO: 344 is the sequence of primer PDM-160 
SEQ ID NO: 345 is the DNA sequence of the fusion protein TbF-6 
SEQ ID NO: 346 is the amino acid sequence of fusion protein TbF-6 
SEQ ID NO: 347 is the sequence of primer PDM-176 
SEQ ID NO: 348 is the sequence of primer PDM-175 
SEQ ID NO: 349 is the DNA sequence of the fusion protein TbF-8 
SEQ ID NO: 350 is the amino acid sequence of the fusion protein TbF-8 



DETAILED DESCRIPTION OF THE INVENTION 

As noted above, the present invention is generally directed to 
compositions and methods for diagnosing tuberculosis. The compositions of the subject 
invention include polypeptides that comprise at least one antigenic portion of a 
M. tuberculosis antigen, or a variant of such an antigen that differs only in conservative 
substitutions and/or modifications. Polypeptides within the scope of the present 
invention include, but are not limited to, soluble M. tuberculosis antigens. A "soluble 
M. tuberculosis antigen" is a protein of M. tuberculosis origin that is present in 
M. tuberculosis culture filtrate. As used herein, the term "polypeptide" encompasses 

amino acid chains of any length, including full length proteins (i.e., antigens), wherein 
the amino acid residues are linked by covalent peptide bonds. 
be derived from the native M. tuberculosis antigen or may be heterologous, and such 
sequences may (but need not) be antigenic. 

An "antigenic portion" of an antigen (which may or may not be soluble) 
is a portion that is capable of reacting with sera obtained from an M. tuberculosis- 
infected individual (i.e., generates an absorbance reading with sera from infected 
individuals that is at least three standard deviations above the absorbance obtained with 
sera from uninfected individuals, in a representative ELISA assay described herein). 
An "M. tuberculosis-infected individual" is a human who has been infected with 
M. tuberculosis (e.g., has an intradermal skin test response to PPD that is at least 0.5 cm 
in diameter). Infected individuals may display symptoms of tuberculosis or may be free 
of disease symptoms. Polypeptides comprising at least an antigenic portion of one or 
more M. tuberculosis antigens as described herein may generally be used, alone or in 
combination, to detect tuberculosis in a patient. 
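
The three-standard-deviation criterion in the definition of an antigenic portion can be sketched as a simple cutoff calculation. The sketch below is illustrative only: the function names and absorbance values are hypothetical, and the use of the sample standard deviation and of the mean of the uninfected readings as the reference absorbance are assumptions, since the text does not specify either.

```python
from statistics import mean, stdev


def reactivity_cutoff(uninfected_od, n_sd=3.0):
    """Cutoff absorbance: mean of uninfected sera plus n_sd standard deviations.

    Assumption: sample standard deviation (n - 1 denominator).
    """
    return mean(uninfected_od) + n_sd * stdev(uninfected_od)


def is_reactive(od, uninfected_od, n_sd=3.0):
    """True if a reading is at least n_sd standard deviations above the
    absorbance obtained with sera from uninfected individuals."""
    return od >= reactivity_cutoff(uninfected_od, n_sd)


# Hypothetical ELISA absorbance readings for four uninfected sera.
uninfected = [0.10, 0.12, 0.08, 0.10]
print(round(reactivity_cutoff(uninfected), 3))
print(is_reactive(0.50, uninfected), is_reactive(0.12, uninfected))
```

A reading from an infected serum would count as reactive here only if it reaches the computed cutoff; a larger panel of uninfected sera gives a more stable estimate of both the mean and the spread.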

The compositions and methods of the present invention also encompass 
variants of the above polypeptides and DNA molecules. A polypeptide "variant," as 
used herein, is a polypeptide that differs from the recited polypeptide only in 
conservative substitutions and/or modifications, such that the therapeutic, antigenic 
and/or immunogenic properties of the polypeptide are retained. Polypeptide variants 
preferably exhibit at .east about ^OV more prcierably at leas; about ^0% and most 
preierabiy at least about 0 ^ r \> identity to the identified polypeptides. For polvpeptides 
with immunoreactive properties, variants may, alternatively, be identified by modifying 
the amino acid sequence of one of the polypeptides, and evaluating the 
immunoreactivity of the modified polypeptide. For polypeptides useful for the 
generation of diagnostic binding agents, a variant may be identified by evaluating a 
modified polypeptide for the ability to generate antibodies that detect the presence or 
absence of tuberculosis. Such modified sequences may be prepared and tested using, 
for example, the representative procedures described herein. 
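
As an illustration of how percent identity between a variant and an identified polypeptide might be computed, the following sketch compares two sequences that have already been aligned to the same length. It is a hypothetical helper, not a procedure taken from this specification: the gap convention (a gap opposite a residue counts as a mismatch, columns with gaps in both sequences are skipped) is an assumption, and the sequences and the threshold in the usage lines are invented for the example.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the aligned columns of two equal-length sequences.

    '-' marks a gap. Assumption: a gap opposite a residue is a mismatch;
    columns where both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Hypothetical one-letter sequences; the threshold is illustrative only.
identity = percent_identity("ACDEFGHIKL", "ACDEFGHIKV")
print(identity, identity >= 70.0)
```

Other identity conventions exist (for example, dividing by the length of the shorter sequence), so the chosen denominator should be stated whenever a threshold is applied.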

A "conservative substitution" is one in which an amino 



hydropathic nature of the polypeptide to be substantially unchanged. In general, the 
following groups of amino acids represent conservative changes: (1) ala, pro, gly, glu, 
asp, gln, asn, ser, thr; (2) cys, ser, tyr, thr; (3) val, ile, leu, met, ala, phe; (4) lys, arg, his; 
and (5) phe, tyr, trp, his. 
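
The five groups listed above lend themselves to a simple lookup. The sketch below is a minimal illustration with hypothetical function names: it treats a substitution as conservative when the two residues differ and share at least one of the listed groups, and it compares only sequences of equal length, so deletions, additions and other modifications are outside its scope.

```python
# Groups (1) to (5) as listed in the text, in three-letter code.
CONSERVATIVE_GROUPS = [
    {"ala", "pro", "gly", "glu", "asp", "gln", "asn", "ser", "thr"},
    {"cys", "ser", "tyr", "thr"},
    {"val", "ile", "leu", "met", "ala", "phe"},
    {"lys", "arg", "his"},
    {"phe", "tyr", "trp", "his"},
]


def is_conservative_substitution(original: str, replacement: str) -> bool:
    """True if the residues differ and share at least one listed group."""
    a, b = original.lower(), replacement.lower()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def differs_only_conservatively(seq_a, seq_b) -> bool:
    """True if two equal-length residue lists differ only by conservative changes."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return all(
        a.lower() == b.lower() or is_conservative_substitution(a, b)
        for a, b in zip(seq_a, seq_b)
    )


print(is_conservative_substitution("Val", "Ile"))
print(differs_only_conservatively(["Asp", "Pro", "Val"], ["Glu", "Pro", "Ile"]))
```

Because several residues (for example ser, thr, ala, phe, tyr, his) appear in more than one group, membership is checked group by group rather than by assigning each residue a single class.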

Variants may also, or alternatively, contain other modifications, 
including the deletion or addition of amino acids that have minimal influence on the 
antigenic properties, secondary structure and hydropathic nature of the polypeptide. For 
example, a polypeptide may be conjugated to a signal (or leader) sequence at the 
N-terminal end of the protein which co-translationally or post-translationally directs 
transfer of the protein. The polypeptide may also be conjugated to a linker or other 
sequence for ease of synthesis, purification or identification of the polypeptide (e.g., 
poly-His), or to enhance binding of the polypeptide to a solid support. For example, a 
polypeptide may be conjugated to an immunoglobulin Fc region. 

A nucleotide "variant" is a sequence that differs from the recited
nucleotide sequence in having one or more nucleotide deletions, substitutions or
additions. Such modifications may be readily introduced using standard mutagenesis
techniques, such as oligonucleotide-directed site-specific mutagenesis as taught, for
example, by Adelman et al. (DNA, 2:183, 1983). Nucleotide variants may be naturally
occurring allelic variants, or non-naturally occurring variants. Variant nucleotide
sequences preferably exhibit at least about 70%, more preferably at least about 80% and
most preferably at least about 90% identity to the recited sequence. Such variant
nucleotide sequences will generally hybridize to the recited nucleotide sequence under
stringent conditions. As used herein, "stringent conditions" refers to prewashing in a
solution of 6X SSC, 0.2% SDS; hybridizing at 65°C, 6X SSC, 0.2% SDS overnight;
followed by two washes of 30 minutes each in 1X SSC, 0.1% SDS at 65°C and two
washes of 30 minutes each in 0.2X SSC, 0.1% SDS at 65°C.

In a related aspect, combination, or fusion, polypeptides are disclosed. A
joined directly (i.e., with no intervening amino acids) or may be joined by way of a
linker sequence (e.g., Gly-Cys-Gly) that does not significantly diminish the antigenic
properties of the component polypeptides.

In general, M. tuberculosis antigens, and DNA sequences encoding such
antigens, may be prepared using any of a variety of procedures. For example, soluble
antigens may be isolated from M. tuberculosis culture filtrate by procedures known to
those of ordinary skill in the art, including anion-exchange and reverse phase
chromatography. Purified antigens may then be evaluated for a desired property, such
as the ability to react with sera obtained from an M. tuberculosis-infected individual.
Such screens may be performed using the representative methods described herein.
Antigens may then be partially sequenced using, for example, traditional Edman
chemistry. See Edman and Berg, Eur. J. Biochem. 80:116-132, 1967.

Antigens may also be produced recombinantly using a DNA sequence
that encodes the antigen, which has been inserted into an expression vector and
expressed in an appropriate host. DNA molecules encoding soluble antigens may be
isolated by screening an appropriate M. tuberculosis expression library with anti-sera
(e.g., rabbit) raised specifically against soluble M. tuberculosis antigens. DNA
sequences encoding antigens that may or may not be soluble may be identified by
screening an appropriate M. tuberculosis genomic or cDNA expression library with sera
obtained from patients infected with M. tuberculosis. Such screens may generally be
performed using techniques well known in the art, such as those described in Sambrook
et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratories,
Cold Spring Harbor, NY, 1989.

DNA sequences encoding soluble antigens may also be obtained by
screening an appropriate M. tuberculosis cDNA or genomic DNA library for DNA
sequences that hybridize to degenerate oligonucleotides derived from partial amino acid
sequences of isolated soluble antigens. Degenerate oligonucleotide sequences for use in
such a screen may be designed and synthesized, and the screen may be performed, as
therein). Polymerase chain reaction (PCR) may also be employed, using the above
oligonucleotides in methods well known in the art, to isolate a nucleic acid probe from a 
cDNA or genomic library. The library screen may then be performed using the isolated 
probe. 

Regardless of the method of preparation, the antigens described herein 
are "antigenic." More specifically, the antigens have the ability to react with sera 
obtained from an M. tuberculosis -infected individual. Reactivity may be evaluated 
using, for example, the representative ELISA assays described herein, where an
absorbance reading with sera from infected individuals that is at least three standard
deviations above the absorbance obtained with sera from uninfected individuals is
considered positive. 

Antigenic portions of M. tuberculosis antigens may be prepared and 
identified using well known techniques, such as those summarized in Paul, 
Fundamental Immunology, 3d ed., Raven Press, 1993, pp. 243-247 and references cited
therein. Such techniques include screening polypeptide portions of the native antigen
for antigenic properties. The representative ELISAs described herein may generally be
employed in these screens. An antigenic portion of a polypeptide is a portion that,
within such representative assays, generates a signal in such assays that is substantially
similar to that generated by the full length antigen. In other words, an antigenic portion
of a M. tuberculosis antigen generates at least about 20%, and preferably about 100%,
of the signal induced by the full length antigen in a model ELISA as described herein.

Portions and other variants of M. tuberculosis antigens may be generated
by synthetic or recombinant means. Synthetic polypeptides having fewer than about
100 amino acids, and generally fewer than about 50 amino acids, may be generated
using techniques well known in the art. For example, such polypeptides may be
synthesized using any of the commercially available solid-phase techniques, such as the
Merrifield solid-phase synthesis method, where amino acids are sequentially added to a
according to the manufacturer's instructions. Variants of a native antigen may generally
be prepared using standard mutagenesis techniques, such as oligonucleotide-directed
site-specific mutagenesis. Sections of the DNA sequence may also be removed using
standard techniques to permit preparation of truncated polypeptides.

Recombinant polypeptides containing portions and/or variants of a
native antigen may be readily prepared from a DNA sequence encoding the polypeptide
using a variety of techniques well known to those of ordinary skill in the art. For
example, supernatants from suitable host/vector systems which secrete recombinant
protein into culture media may be first concentrated using a commercially available
filter. Following concentration, the concentrate may be applied to a suitable
purification matrix such as an affinity matrix or an ion exchange resin. Finally, one or
more reverse phase HPLC steps can be employed to further purify a recombinant 
protein. 

Any of a variety of expression vectors known to those of ordinary skill in
the art may be employed to express recombinant polypeptides as described herein.
Expression may be achieved in any appropriate host cell that has been transformed or
transfected with an expression vector containing a DNA molecule that encodes a
recombinant polypeptide. Suitable host cells include prokaryotes, yeast and higher
eukaryotic cells. Preferably, the host cells employed are E. coli, yeast or a mammalian
cell line, such as COS or CHO. The DNA sequences expressed in this manner may
encode naturally occurring antigens, portions of naturally occurring antigens, or other
variants thereof.

In general, regardless of the method of preparation, the polypeptides
disclosed herein are prepared in substantially pure form. Preferably, the polypeptides
are at least about 80% pure, more preferably at least about 90% pure and most
preferably at least about 99% pure. For use in the methods described herein, however,
such substantially pure polypeptides may be combined.

antigen (or a variant of such an antigen), where the antigen has one of the following
N-terminal sequences:

(a) Asp-Pro-Val-Asp-Ala-Val-Ile-Asn-Thr-Thr-Cys-Asn-Tyr-Gly-
Gln-Val-Val-Ala-Ala-Leu (SEQ ID NO: 115);

(b) Ala-Val-Glu-Ser-Gly-Met-Leu-Ala-Leu-Gly-Thr-Pro-Ala-Pro-
Ser (SEQ ID NO: 116);

(c) Ala-Ala-Met-Lys-Pro-Arg-Thr-Gly-Asp-Gly-Pro-Leu-Glu-Ala-
Ala-Lys-Glu-Gly-Arg (SEQ ID NO: 117);

(d) Tyr-Tyr-Trp-Cys-Pro-Gly-Gln-Pro-Phe-Asp-Pro-Ala-Trp-Gly-
Pro (SEQ ID NO: 118);

(e) Asp-Ile-Gly-Ser-Glu-Ser-Thr-Glu-Asp-Gln-Gln-Xaa-Ala-Val
(SEQ ID NO: 119);

(f) Ala-Glu-Glu-Ser-Ile-Ser-Thr-Xaa-Glu-Xaa-Ile-Val-Pro (SEQ ID
NO: 120);

(g) Asp-Pro-Glu-Pro-Ala-Pro-Pro-Val-Pro-Thr-Thr-Ala-Ala-Ser-
Pro-Pro-Ser (SEQ ID NO: 121);

(h) Ala-Pro-Lys-Thr-Tyr-Xaa-Glu-Glu-Leu-Lys-Gly-Thr-Asp-Thr-
Gly (SEQ ID NO: 122);

(i) Asp-Pro-Ala-Ser-Ala-Pro-Asp-Val-Pro-Thr-Ala-Ala-Gln-Leu-
Thr-Ser-Leu-Leu-Asn-Ser-Leu-Ala-Asp-Pro-Asn-Val-Ser-Phe-
Ala-Asn (SEQ ID NO: 123);

(j) Xaa-Asp-Ser-Glu-Lys-Ser-Ala-Thr-Ile-Lys-Val-Thr-Asp-Ala-
Ser; (SEQ ID NO: 129)

(k) Ala-Gly-Asp-Thr-Xaa-Ile-Tyr-Ile-Val-Gly-Asn-Leu-Thr-Ala-
Asp; (SEQ ID NO: 130) or

(l) Ala-Pro-Glu-Ser-Gly-Ala-Gly-Leu-Gly-Gly-Thr-Val-Gln-Ala-
Gly; (SEQ ID NO: 131)

encoding the antigen identified as (a) above is provided in SEQ ID NO: 96; its deduced
amino acid sequence is provided in SEQ ID NO: 97. A DNA sequence corresponding
to antigen (d) above is provided in SEQ ID NO: 24, a DNA sequence corresponding to
antigen (c) is provided in SEQ ID NO: 25 and a DNA sequence corresponding to
antigen (l) is disclosed in SEQ ID NO: 94 and its deduced amino acid sequence is
provided in SEQ ID NO: 95.

In a further specific embodiment, the subject invention discloses
polypeptides comprising at least an immunogenic portion of an M. tuberculosis antigen
having one of the following N-terminal sequences, or a variant thereof that differs only
in conservative substitutions and/or modifications:

(m) Xaa-Tyr-Ile-Ala-Tyr-Xaa-Thr-Thr-Ala-Gly-Ile-Val-Pro-Gly-Lys-
Ile-Asn-Val-His-Leu-Val; (SEQ ID NO: 132) or

(n) Asp-Pro-Pro-Asp-Pro-His-Gln-Xaa-Asp-Met-Thr-Lys-Gly-Tyr-
Tyr-Pro-Gly-Gly-Arg-Arg-Xaa-Phe; (SEQ ID NO: 124)
wherein Xaa may be any amino acid, preferably a cysteine residue. A DNA sequence
encoding the antigen of (n) above is provided in SEQ ID NO: 235, with the
corresponding predicted full-length amino acid sequence being provided in SEQ ID
NO: 236.



In other specific embodiments, the subject invention discloses
polypeptides comprising at least an antigenic portion of a soluble M. tuberculosis
antigen (or a variant of such an antigen) that comprises one or more of the amino acid
sequences encoded by (a) the DNA sequences of SEQ ID NOS: . L * >. !3-:5.
^: >V, (b) the complements of such DNA sequences, or (c) DNA sequences
substantially homologous to a sequence in (a) or (b).

In further specific embodiments, the subject invention discloses
polypeptides comprising at least an antigenic portion of a M. tuberculosis antigen (or a
variant of such an antigen), which may or may not be soluble, that comprises

242, 248-251, 256-271, 287, 288, 290-293 and 298-33', (b) the complements of such
DNA sequences or (c) DNA sequences substantially homologous to a sequence in (a) or
(b).

In a related aspect, the present invention provides fusion proteins
comprising a first and a second inventive polypeptide or, alternatively, a polypeptide of
the present invention and a known M. tuberculosis antigen, such as the 38 kD antigen
described in Andersen and Hansen, Infect. Immun. 57:2481-2488, 1989 (Genbank
Accession No. M30046), or ESAT-6 (SEQ ID NOS: 98 and 99), together with variants
of such fusion proteins. The fusion proteins of the present invention may also include a
linker peptide between the first and second polypeptides.

A DNA sequence encoding a fusion protein of the present invention is 
constructed using known recombinant DNA techniques to assemble separate DNA 
sequences encoding the first and second polypeptides into an appropriate expression 
vector. The 3' end of a DNA sequence encoding the first polypeptide is ligated, with or 
without a peptide linker, to the 5' end of a DNA sequence encoding the second
polypeptide so that the reading frames of the sequences are in phase to permit mRNA
translation of the two DNA sequences into a single fusion protein that retains the
biological activity of both the first and the second polypeptides.
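
The in-phase requirement described above can be checked computationally before a construct is assembled. The following is a minimal sketch, assuming that each coding sequence is supplied as a DNA string starting at its first codon and that a terminal stop codon on the first sequence should be dropped so translation reads through; the function name `fuse_in_frame` and the example sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fuse_in_frame(first_cds, second_cds, linker_dna=""):
    """Join two coding sequences, optionally via a linker, keeping the frame.

    Assumptions for this sketch: every input starts at codon position 1 and
    has a length that is a multiple of three; anything else is rejected.
    """
    first = first_cds.upper()
    linker = linker_dna.upper()
    second = second_cds.upper()

    for name, seq in (("first", first), ("linker", linker), ("second", second)):
        if len(seq) % 3 != 0:
            raise ValueError(f"{name} sequence length is not a multiple of 3")

    # Remove the stop codon of the first sequence so translation continues.
    if first[-3:] in STOP_CODONS:
        first = first[:-3]

    fused = first + linker + second

    # Any stop codon before the final codon would truncate the fusion protein.
    codons = [fused[i:i + 3] for i in range(0, len(fused), 3)]
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        raise ValueError("internal stop codon in fused sequence")
    return fused
```

As a usage example, `fuse_in_frame("ATGGCCTAA", "ATGAAATGA", "GGTTGTGGT")` joins two toy sequences through a linker encoding Gly-Cys-Gly.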

A peptide linker sequence may be employed to separate the first and the
second polypeptides by a distance sufficient to ensure that each polypeptide folds into
its secondary and tertiary structures. Such a peptide linker sequence is incorporated into
the fusion protein using standard techniques well known in the art. Suitable peptide
linker sequences may be chosen based on the following factors: (1) their ability to
adopt a flexible extended conformation; (2) their inability to adopt a secondary structure
that could interact with functional epitopes on the first and second polypeptides; and
(3) the lack of hydrophobic or charged residues that might react with the polypeptide
functional epitopes. Preferred peptide linker sequences contain Gly, Asn and Ser
residues. Other near neutral amino acids, such as Thr and Ala, may also be used in the
linker sequence.

Natl. Acad. Sci. USA 83:8258-8562, 1986; U.S. Patent No. 4,935,233 and U.S. Patent
No. 4,751,180. The linker sequence may be from 1 to about 50 amino acids in length.
Peptide linker sequences are not required when the first and second polypeptides have
non-essential N-terminal amino acid regions that can be used to separate the functional
domains and prevent steric hindrance.

In another aspect the present invention provides methods for using the 
polypeptides described above to diagnose tuberculosis. In this aspect, methods are 
provided for detecting M. tuberculosis infection in a biological sample, using one or 
more of the above polypeptides, alone or in combination. In embodiments in which 
multiple polypeptides are employed, polypeptides other than those specifically
described herein, such as the 38 kD antigen described in Andersen and Hansen, Infect.
Immun. 57:2481-2488, 1989, may be included. As used herein, a "biological sample" is
any antibody-containing sample obtained from a patient. Preferably, the sample is
whole blood, sputum, serum, plasma, saliva, cerebrospinal fluid or urine. More
preferably, the sample is a blood, serum or plasma sample obtained from a patient or a
blood supply. The polypeptide(s) are used in an assay, as described below, to determine
the presence or absence of antibodies to the polypeptide(s) in the sample, relative to a
predetermined cut-off value. The presence of such antibodies indicates previous
sensitization to mycobacterial antigens which may be indicative of tuberculosis.

In embodiments in which more than one polypeptide is employed, the
polypeptides used are preferably complementary (i.e., one component polypeptide will
tend to detect infection in samples where the infection would not be detected by another
component polypeptide). Complementary polypeptides may generally be identified by
using each polypeptide individually to evaluate serum samples obtained from a series of
patients known to be infected with M. tuberculosis. After determining which samples
test positive (as described below) with each polypeptide, combinations of two or more
polypeptides may be formulated that are capable of detecting infection in most, or all, of
the samples tested. Such polypeptides are complementary. For example, approximately

polypeptides may, therefore, be used in combination with the 38 kD antigen to improve
sensitivity of a diagnostic test.
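
One way to formulate such combinations from individual test results is a greedy selection, sketched below. This is an illustrative assumption rather than the procedure of the disclosure: the input is taken to be a mapping from each polypeptide to the set of known-infected samples it detects, and the names `pick_complementary`, `antigenA` and `antigenB` are hypothetical.

```python
def pick_complementary(reactivity, max_antigens=None):
    """Greedily choose polypeptides that together detect the most samples.

    reactivity: dict mapping a polypeptide name to the set of sample ids
    (from patients known to be infected) that tested positive with it.
    Returns the chosen names in selection order and the samples they cover.
    """
    detectable = set().union(*reactivity.values())
    chosen, covered = [], set()

    while covered != detectable:
        if max_antigens is not None and len(chosen) >= max_antigens:
            break
        # Pick the polypeptide that adds the most not-yet-detected samples;
        # iterating over sorted names makes tie-breaking deterministic.
        best = max(sorted(reactivity), key=lambda name: len(reactivity[name] - covered))
        gain = reactivity[best] - covered
        if not gain:
            break
        chosen.append(best)
        covered |= gain

    return chosen, covered
```

With the toy input `{"38kD": {1, 2, 3, 4}, "antigenA": {3, 4, 5}, "antigenB": {5, 6}}`, the 38 kD antigen is chosen first and `antigenB` second, because `antigenB` detects both samples that the 38 kD antigen misses.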

There are a variety of assay formats known to those of ordinary skill in 
the art for using one or more polypeptides to detect antibodies in a sample. See, e.g., 
Harlow and Lane, Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory, 
1988, which is incorporated herein by reference. In a preferred embodiment, the assay 
involves the use of polypeptide immobilized on a solid support to bind to and remove 
the antibody from the sample. The bound antibody may then be detected using a 
detection reagent that contains a reporter group. Suitable detection reagents include 
antibodies that bind to the antibody/polypeptide complex and free polypeptide labeled
with a reporter group (e.g., in a semi-competitive assay). Alternatively, a competitive
assay may be utilized, in which an antibody that binds to the polypeptide is labeled with
a reporter group and allowed to bind to the immobilized antigen after incubation of the 
antigen with the sample. The extent to which components of the sample inhibit the 
binding of the labeled antibody to the polypeptide is indicative of the reactivity of the 
sample with the immobilized polypeptide. 

The solid support may be any solid material known to those of ordinary
skill in the art to which the antigen may be attached. For example, the solid support
may be a test well in a microtiter plate or a nitrocellulose or other suitable membrane.
Alternatively, the support may be a bead or disc, such as glass, fiberglass, latex or a
plastic material such as polystyrene or polyvinylchloride. The support may also be a
magnetic particle or a fiber optic sensor, such as those disclosed, for example, in U.S.
Patent No. 5,359,681.

The polypeptides may be bound to the solid support using a variety of
techniques known to those of ordinary skill in the art, which are amply described in the
patent and scientific literature. In the context of the present invention, the term "bound"
refers to both noncovalent association, such as adsorption, and covalent attachment
(which may be a direct linkage between the antigen and functional groups on the

be achieved by contacting the polypeptide, in a suitable buffer, with the solid support
for a suitable amount of time. The contact time varies with temperature, but is typically
between about 1 hour and 1 day. In general, contacting a well of a plastic microtiter
plate (such as polystyrene or polyvinylchloride) with an amount of polypeptide ranging
from about 10 ng to about 1 µg, and preferably about 100 ng, is sufficient to bind an
adequate amount of antigen.

Covalent attachment of polypeptide to a solid support may generally be
achieved by first reacting the support with a bifunctional reagent that will react with
both the support and a functional group, such as a hydroxyl or amino group, on the
polypeptide. For example, the polypeptide may be bound to supports having an
appropriate polymer coating using benzoquinone or by condensation of an aldehyde
group on the support with an amine and an active hydrogen on the polypeptide (see,
e.g., Pierce Immunotechnology Catalog and Handbook, 1991, at A12-A13).

In certain embodiments, the assay is an enzyme linked immunosorbent
assay (ELISA). This assay may be performed by first contacting a polypeptide antigen
that has been immobilized on a solid support, commonly the well of a microtiter plate,
with the sample, such that antibodies to the polypeptide within the sample are allowed
to bind to the immobilized polypeptide. Unbound sample is then removed from the
immobilized polypeptide and a detection reagent capable of binding to the immobilized
antibody-polypeptide complex is added. The amount of detection reagent that remains
bound to the solid support is then determined using a method appropriate for the
specific detection reagent.

More specifically, once the polypeptide is immobilized on the support as
described above, the remaining protein binding sites on the support are typically
blocked. Any suitable blocking agent known to those of ordinary skill in the art, such
as bovine serum albumin or Tween 20™ (Sigma Chemical Co., St. Louis, MO) may be
employed. The immobilized polypeptide is then incubated with the sample, and
antibody is allowed to bind to the antigen. The sample may be diluted with a suitable

detect the presence of antibody within a M. tuberculosis-infected sample. Preferably,
the contact time is sufficient to achieve a level of binding that is at least 95% of that
achieved at equilibrium between bound and unbound antibody. Those of ordinary skill
in the art will recognize that the time necessary to achieve equilibrium may be readily
determined by assaying the level of binding that occurs over a period of time. At room
temperature, an incubation time of about 30 minutes is generally sufficient.
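
The time-course determination mentioned above can be sketched as follows. This is a minimal illustration under two stated assumptions: binding is measured at a series of increasing time points, and the final reading is taken as the equilibrium level. The function name and the example readings are hypothetical, not measured data.

```python
def time_to_fraction_of_equilibrium(times, signals, fraction=0.95):
    """Return the first time at which binding reaches the given fraction.

    times: measurement times in increasing order (for example, minutes).
    signals: binding level measured at each time.
    Assumption: the last reading approximates the equilibrium level.
    """
    if not times or len(times) != len(signals):
        raise ValueError("times and signals must be non-empty and equal in length")

    equilibrium = signals[-1]
    target = fraction * equilibrium
    for t, s in zip(times, signals):
        if s >= target:
            return t
    return None
```

For the illustrative readings `[0.20, 0.45, 0.80, 0.96, 1.00, 1.00]` taken at `[5, 10, 20, 30, 60, 120]` minutes, the function returns 30.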

Unbound sample may then be removed by washing the solid support 
with an appropriate buffer, such as PBS containing 0.1% Tween 20™. Detection 
reagent may then be added to the solid support. An appropriate detection reagent is any
compound that binds to the immobilized antibody-polypeptide complex and that can be
detected by any of a variety of means known to those in the art. Preferably, the
detection reagent contains a binding agent (such as, for example, Protein A, Protein G,
immunoglobulin, lectin or free antigen) conjugated to a reporter group. Preferred
reporter groups include enzymes (such as horseradish peroxidase), substrates, cofactors,
inhibitors, dyes, radionuclides, luminescent groups, fluorescent groups, biotin and
colloidal particles, such as colloidal gold and selenium. The conjugation of binding
agent to reporter group may be achieved using standard methods known to those of
ordinary skill in the art. Common binding agents may also be purchased conjugated to
a variety of reporter groups from many commercial sources (e.g., Zymed Laboratories,
San Francisco, CA, and Pierce, Rockford, IL).

The detection reagent is then incubated with the immobilized antibody-
polypeptide complex for an amount of time sufficient to detect the bound antibody. An
appropriate amount of time may generally be determined from the manufacturer's
instructions or by assaying the level of binding that occurs over a period of time.
Unbound detection reagent is then removed and bound detection reagent is detected
using the reporter group. The method employed for detecting the reporter group
depends upon the nature of the reporter group. For radioactive groups, scintillation
counting or autoradiographic methods are generally appropriate. Spectroscopic

radioactive or fluorescent group or an enzyme). Enzyme reporter groups may generally
be detected by the addition of substrate (generally for a specific period of time),
followed by spectroscopic or other analysis of the reaction products.

To determine the presence or absence of anti-M. tuberculosis antibodies
in the sample, the signal detected from the reporter group that remains bound to the
solid support is generally compared to a signal that corresponds to a predetermined cut-
off value. In one preferred embodiment, the cut-off value is the average mean signal
obtained when the immobilized antigen is incubated with samples from an uninfected
patient. In general, a sample generating a signal that is three standard deviations above
the predetermined cut-off value is considered positive for tuberculosis. In an alternate
preferred embodiment, the cut-off value is determined using a Receiver Operator Curve,
according to the method of Sackett et al., Clinical Epidemiology: A Basic Science for
Clinical Medicine, Little Brown and Co., 1985, pp. 106-107. Briefly, in this
embodiment, the cut-off value may be determined from a plot of pairs of true positive
rates (i.e., sensitivity) and false positive rates (100%-specificity) that correspond to each
possible cut-off value for the diagnostic test result. The cut-off value on the plot that is
the closest to the upper left-hand corner (i.e., the value that encloses the largest area) is
the most accurate cut-off value, and a sample generating a signal that is higher than the
cut-off value determined by this method may be considered positive. Alternatively, the
cut-off value may be shifted to the left along the plot, to minimize the false positive
rate, or to the right, to minimize the false negative rate. In general, a sample generating
a signal that is higher than the cut-off value determined by this method is considered
positive for tuberculosis.
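
Both approaches to setting the threshold can be sketched in a few lines. The following is an illustration only, assuming that signals are plain absorbance values, that a sample is called positive when its signal is strictly higher than the threshold, and that the candidate cut-offs for the Receiver Operator Curve are the observed signal values themselves; the function names and example readings are hypothetical.

```python
import math
from statistics import mean, stdev


def threshold_mean_plus_sd(uninfected_signals, n_sd=3.0):
    """Mean signal of uninfected samples plus n_sd sample standard deviations."""
    return mean(uninfected_signals) + n_sd * stdev(uninfected_signals)


def roc_cutoff(infected_signals, uninfected_signals):
    """Pick the cut-off whose ROC point lies closest to the upper left corner.

    For each candidate cut-off, the true positive rate (sensitivity) and the
    false positive rate (1 - specificity) are computed, and the candidate
    minimizing the distance to the point (FPR = 0, TPR = 1) is returned
    together with its rates.
    """
    best = None
    for cutoff in sorted(set(infected_signals) | set(uninfected_signals)):
        tpr = sum(s > cutoff for s in infected_signals) / len(infected_signals)
        fpr = sum(s > cutoff for s in uninfected_signals) / len(uninfected_signals)
        distance = math.hypot(fpr, 1.0 - tpr)
        if best is None or distance < best[0]:
            best = (distance, cutoff, tpr, fpr)
    return best[1], best[2], best[3]
```

Shifting the returned cut-off upward trades sensitivity for a lower false positive rate, and shifting it downward does the reverse, which corresponds to moving left or right along the plot.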

In a related embodiment, the assay is performed in a rapid flow-through
or strip test format, wherein the antigen is immobilized on a membrane, such as
nitrocellulose. In the flow-through test, antibodies within the sample bind to the
immobilized polypeptide as the sample passes through the membrane. A detection
reagent (e.g., protein A-colloidal gold)

strip test format, one end of the membrane to which polypeptide is bound is immersed
in a solution containing the sample. The sample migrates along the membrane through
a region containing detection reagent and to the area of immobilized polypeptide.
Concentration of detection reagent at the polypeptide indicates the presence of anti-
M. tuberculosis antibodies in the sample. Typically, the concentration of detection
reagent at that site generates a pattern, such as a line, that can be read visually. The
absence of such a pattern indicates a negative result. In general, the amount of
polypeptide immobilized on the membrane is selected to generate a visually discernible
pattern when the biological sample contains a level of antibodies that would be
sufficient to generate a positive signal in an ELISA, as discussed above. Preferably, the
amount of polypeptide immobilized on the membrane ranges from about 25 ng to about
1 µg, and more preferably from about 50 ng to about 500 ng. Such tests can typically
be performed with a very small amount (e.g., one drop) of patient serum or blood.

Of course, numerous other assay protocols exist that are suitable for use
with the polypeptides of the present invention. The above descriptions are intended to
be exemplary only.

In yet another aspect, the present invention provides antibodies to the
inventive polypeptides. Antibodies may be prepared by any of a variety of techniques
known to those of ordinary skill in the art. See, e.g., Harlow and Lane, Antibodies: A
Laboratory Manual, Cold Spring Harbor Laboratory, 1988. In one such technique, an
immunogen comprising the antigenic polypeptide is initially injected into any of a wide
variety of mammals (e.g., mice, rats, rabbits, sheep and goats). In this step, the
polypeptides of this invention may serve as the immunogen without modification.
Alternatively, particularly for relatively short polypeptides, a superior immune response
may be elicited if the polypeptide is joined to a carrier protein, such as bovine serum
albumin or keyhole limpet hemocyanin. The immunogen is injected into the animal
host, preferably according to a predetermined schedule incorporating one or more
booster immunizations, and the animals are bled periodically. Polyclonal antibodies

Monoclonal antibodies specific for the antigenic polypeptide of interest
may be prepared, for example, using the technique of Kohler and Milstein, Eur. J.
Immunol. 6:511-519, 1976, and improvements thereto. Briefly, these methods involve
the preparation of immortal cell lines capable of producing antibodies having the
desired specificity (i.e., reactivity with the polypeptide of interest). Such cell lines may
be produced, for example, from spleen cells obtained from an animal immunized as
described above. The spleen cells are then immortalized by, for example, fusion with a
myeloma cell fusion partner, preferably one that is syngeneic with the immunized
animal. A variety of fusion techniques may be employed. For example, the spleen cells
and myeloma cells may be combined with a nonionic detergent for a few minutes and
then plated at low density on a selective medium that supports the growth of hybrid
cells, but not myeloma cells. A preferred selection technique uses HAT (hypoxanthine,
aminopterin, thymidine) selection. After a sufficient time, usually about 1 to 2 weeks,
colonies of hybrids are observed. Single colonies are selected and tested for binding
activity against the polypeptide. Hybridomas having high reactivity and specificity are
preferred.

Monoclonal antibodies may be isolated from the supernatants of growing
hybridoma colonies. In addition, various techniques may be employed to enhance the
yield, such as injection of the hybridoma cell line into the peritoneal cavity of a suitable
vertebrate host, such as a mouse. Monoclonal antibodies may then be harvested from
the ascites fluid or the blood. Contaminants may be removed from the antibodies by
conventional techniques, such as chromatography, gel filtration, precipitation, and
extraction. The polypeptides of this invention may be used in the purification process in,
for example, an affinity chromatography step.

Antibodies may be used in diagnostic tests to detect the presence of
M. tuberculosis antigens using assays similar to those detailed above and other
techniques well known to those of skill in the art, thereby providing a method for
detecting M. tuberculosis infection in a patient.

thereof. For example, at least two oligonucleotide primers may be employed in a
polymerase chain reaction (PCR) based assay to amplify M. tuberculosis-specific
cDNA derived from a biological sample, wherein at least one of the oligonucleotide
primers is specific for a DNA molecule encoding a polypeptide of the present invention.
The presence of the amplified cDNA is then detected using techniques well known in
the art, such as gel electrophoresis. Similarly, oligonucleotide probes specific for a
DNA molecule encoding a polypeptide of the present invention may be used in a
hybridization assay to detect the presence of an inventive polypeptide in a biological
sample.

As used herein, the term "oligonucleotide primer/probe specific for a
DNA molecule" means an oligonucleotide sequence that has at least about 80%,
preferably at least about 90% and more preferably at least about 95%, identity to the
DNA molecule in question. Oligonucleotide primers and/or probes which may be
usefully employed in the inventive diagnostic methods preferably have at least about
10-40 nucleotides. In a preferred embodiment, the oligonucleotide primers comprise at
least about 10 contiguous nucleotides of a DNA molecule encoding one of the
polypeptides disclosed herein. Preferably, oligonucleotide probes for use in the
inventive diagnostic methods comprise at least about 15 contiguous oligonucleotides of
a DNA molecule encoding one of the polypeptides disclosed herein. Techniques for
both PCR based assays and hybridization assays are well known in the art (see, for
example, Mullis et al. Ibid; Ehrlich, Ibid). Primers or probes may thus be used to detect
M. tuberculosis-specific sequences in biological samples. DNA probes or primers
comprising oligonucleotide sequences described above may be used alone, in
combination with each other, or with previously defined sequences, such as the 38 kD
antigen discussed above.

PURIFICATION AND CHARACTERIZATION OF POLYPEPTIDES

FROM M. TUBERCULOSIS CULTURE FILTRATE

This example illustrates the preparation of M. tuberculosis soluble 
polypeptides from culture filtrate. Unless otherwise noted, all percentages in the 
following example are weight per volume. 

M. tuberculosis (either H37Ra, ATCC No. 25177, or H37Rv, ATCC 
No. 25618) was cultured in sterile GAS media at 37°C for fourteen days. The media 
was then vacuum filtered (leaving the bulk of the cells) through a 0.45 µ filter into a 
sterile 15 L bottle. The media was then filtered through a 0.2 µ filter into a sterile 4 L 
bottle. NaN3 was then added to the culture filtrate to a concentration of 0.04%. The 
bottles were then placed in a 4°C cold room. 

The culture filtrate was concentrated by placing the filtrate in a 12 L 
reservoir that had been autoclaved and feeding the filtrate into a 400 ml Amicon stir cell 
which had been rinsed with ethanol and contained a 10,000 kDa MWCO membrane. 
The pressure was maintained at 60 psi using nitrogen gas. This procedure reduced the 
12 L volume to approximately 50 ml. 

The culture filtrate was then dialyzed into 0.1% ammonium bicarbonate 
using an 8,000 kDa MWCO cellulose ester membrane, with changes of ammonium 
bicarbonate solution. Protein concentration was then determined by a commercially 
available BCA assay (Pierce, Rockford, IL). 

The dialyzed culture filtrate was then lyophilized, and the polypeptides 
resuspended in distilled water. The polypeptides were then dialyzed against 0.01 mM 
1,3 bis[tris(hydroxymethyl)-methylamino]propane, pH 7.5 (Bis-Tris propane buffer), 
Bis-Tris propane buffer pH 7.5. Polypeptides were eluted with a linear 0-0.5 M NaCl 
gradient in the above buffer system. The column eluent was monitored at a wavelength 
of 220 nm. 

The pools of polypeptides eluting from the ion exchange column were 
dialyzed against distilled water and lyophilized. The resulting material was dissolved in 
0.1% trifluoroacetic acid (TFA) pH 1.9 in water, and the polypeptides were purified on 
a Delta-Pak C18 column (Waters, Milford, MA) 300 Angstrom pore size, 5 micron 
particle size (3.9 x 150 mm). The polypeptides were eluted from the column with a 
linear gradient from 0-60% dilution buffer (0.1% TFA in acetonitrile). The flow rate 
was 0.75 ml/minute and the HPLC eluent was monitored at 214 nm. Fractions 
containing the eluted polypeptides were collected to maximize the purity of the 
individual samples. Approximately 200 purified polypeptides were obtained. 

The purified polypeptides were then screened for the ability to induce T- 
cell proliferation in PBMC preparations. The PBMCs from donors known to be PPD 
skin test positive and whose T cells were shown to proliferate in response to PPD and 
crude soluble proteins from MTB were cultured in medium comprising RPMI 1640 
supplemented with 10% pooled human serum and 50 µg/ml gentamicin. Purified 
polypeptides were added in duplicate at concentrations of 0.5 to 10 µg/mL. After six 
days of culture in 96-well round-bottom plates in a volume of 200 µl, 50 µl of medium 
was removed from each well for determination of IFN-γ levels, as described below. 
The plates were then pulsed with 1 µCi/well of tritiated thymidine for a further 18 
hours, harvested and tritium uptake determined using a gas scintillation counter. 
Fractions that resulted in proliferation in both replicates three fold greater than the 
proliferation observed in cells cultured in medium alone were considered positive. 
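
The positivity rule for the proliferation assay can be written as a short calculation. 
The Python sketch below is an illustration, not the procedure actually used: the 
function names are hypothetical, the inputs are assumed to be counts per minute (cpm) 
from the scintillation counter, and the medium-alone baseline is assumed to be the 
mean of the control wells. 

```python
from statistics import mean


def stimulation_index(cpm: float, medium_cpm: list[float]) -> float:
    """Ratio of thymidine uptake in a test well to the mean of medium-alone wells."""
    return cpm / mean(medium_cpm)


def proliferation_positive(replicate_cpm: list[float],
                           medium_cpm: list[float],
                           fold: float = 3.0) -> bool:
    """A fraction is positive only if every replicate exceeds `fold` x baseline."""
    return all(stimulation_index(c, medium_cpm) > fold for c in replicate_cpm)


# Hypothetical counts: both replicates must clear the three-fold threshold.
print(proliferation_positive([3500, 4200], [900, 1100]))
print(proliferation_positive([3500, 2900], [900, 1100]))
```

Requiring both replicates to pass, rather than their average, guards against a single 
outlier well producing a false positive. 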

IFN-γ was measured using an enzyme-linked immunosorbent assay 
(ELISA). ELISA plates were coated with a mouse monoclonal antibody directed to 
human IFN-γ (Chemicon) in PBS for four hours at room temperature. Wells were then 
blocked with PBS containing 5% (W/V) non-fat dried milk for 1 hour at 
room temperature. The plates were again washed and a polyclonal rabbit anti-human 
IFN-γ serum diluted 1:3000 in PBS/10% normal goat serum was added to each well. 
The plates were then incubated for two hours at room temperature, washed and 
horseradish peroxidase-coupled anti-rabbit IgG (Jackson Labs.) was added at a 1:2000 
dilution in PBS/5% non-fat dried milk. After a further two hour incubation at room 
temperature, the plates were washed and TMB substrate added. The reaction was 
stopped after 20 min with 1 N sulfuric acid. Optical density was determined at 450 nm 
using 570 nm as a reference wavelength. Fractions that resulted in both replicates 
giving an OD two fold greater than the mean OD from cells cultured in medium alone, 
plus 3 standard deviations, were considered positive. 
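
The ELISA cut-off can be expressed in the same way. This Python sketch is 
illustrative and rests on one stated assumption about the wording above: the cut-off 
is taken as twice the mean medium-alone OD plus three sample standard deviations of 
the medium-alone wells. The function names and example OD values are hypothetical. 

```python
from statistics import mean, stdev


def elisa_cutoff(medium_od: list[float],
                 fold: float = 2.0, n_sd: float = 3.0) -> float:
    """Assumed reading of the rule: fold x mean(control) + n_sd x SD(control)."""
    return fold * mean(medium_od) + n_sd * stdev(medium_od)


def elisa_positive(replicate_od: list[float], medium_od: list[float]) -> bool:
    """A fraction is positive only if both replicate ODs exceed the cut-off."""
    cutoff = elisa_cutoff(medium_od)
    return all(od > cutoff for od in replicate_od)


# Hypothetical ODs (450 nm, 570 nm reference already subtracted).
controls = [0.10, 0.12, 0.11]
print(round(elisa_cutoff(controls), 3))
print(elisa_positive([0.30, 0.28], controls))
```

If the intended rule were instead twice the quantity (mean plus three standard 
deviations), only `elisa_cutoff` would need to change. 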

For sequencing, the polypeptides were individually dried onto 
Biobrene™ (Perkin Elmer/Applied BioSystems Division, Foster City, CA) treated glass 
fiber filters. The filters with polypeptide were loaded onto a Perkin Elmer/Applied 
BioSystems Division Procise 492 protein sequencer. The polypeptides were sequenced 
from the amino terminal and using traditional Edman chemistry. The amino acid 
sequence was determined for each polypeptide by comparing the retention time of the 
PTH amino acid derivative to the appropriate PTH derivative standards. 

Using the procedure described above, antigens having the following 
N-terminal sequences were isolated: 

(a) Asp-Pro-Val-Asp-Ala-Val-Ile-Asn-Thr-Thr-Xaa-Asn-Tyr-Gly- 
Gln-Val-Val-Ala-Ala-Leu (SEQ ID NO: 54); 

(b) Ala-Val-Glu-Ser-Gly-Met-Leu-Ala-Leu-Gly-Thr-Pro-Ala-Pro- 
Ser (SEQ ID NO: 55); 

(c) Ala-Ala-Met-Lys-Pro-Arg-Thr-Gly-Asp-Gly-Pro-Leu-Glu-Ala- 
Ala-Lys-Glu-Gly-Arg (SEQ ID NO: 56); 

(d) Tyr-Tyr-Trp-Cys-Pro-Gly-Gln-Pro-Phe-Asp-Pro-Ala-Trp-Gly- 
Pro (SEQ ID NO: 57); 

(e) Asp-Ile-Gly-Ser-Glu-Ser-Thr-Glu-Asp-Gln-Gln-Xaa-Ala-Val 
(SEQ ID NO: 58); 

(f) Ala-Glu-Glu-Ser-Ile-Ser-Thr-Xaa-Glu-Xaa-Ile-Val-Pro (SEQ ID 
NO: 59); 

(g) Asp-Pro-Glu-Pro-Ala-Pro-Pro-Val-Pro-Thr-Ala-Ala-Ala-Ala- 
Pro-Pro-Ala (SEQ ID NO: 60); and 

(h) Ala-Pro-Lys-Thr-Tyr-Xaa-Glu-Glu-Leu-Lys-Gly-Thr-Asp-Thr- 
Gly (SEQ ID NO: 61); 
wherein Xaa may be any amino acid. 
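
Because Xaa stands for any amino acid, comparing such an N-terminal sequence against 
a candidate protein amounts to a wildcard match. The Python sketch below illustrates 
this with sequence (f) as read above; the helper names and the candidate protein 
string are hypothetical, and the approach (one-letter conversion followed by a 
regular expression) is an assumption made for the example rather than the search 
method used in this document. 

```python
import re

# Standard three-letter to one-letter amino acid codes; Xaa maps to the wildcard X.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Xaa": "X",
}


def to_one_letter(seq3: str) -> str:
    """Convert a hyphen-separated three-letter sequence to one-letter code."""
    return "".join(THREE_TO_ONE[res] for res in seq3.split("-"))


def to_pattern(seq1: str) -> re.Pattern:
    """Compile a pattern in which each X matches any single residue."""
    return re.compile(seq1.replace("X", "[A-Z]"))


antigen_f = to_one_letter("Ala-Glu-Glu-Ser-Ile-Ser-Thr-Xaa-Glu-Xaa-Ile-Val-Pro")
candidate = "MKAEESISTCEGIVPLL"  # hypothetical protein fragment, for illustration
match = to_pattern(antigen_f).search(candidate)
print(antigen_f, match.start() if match else None)
```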

An additional antigen was isolated employing a microbore HPLC 
purification step in addition to the procedure described above. Specifically, 20 µl of a 
fraction comprising a mixture of antigens from the chromatographic purification step 
previously described, was purified on an Aquapore C18 column (Perkin Elmer/Applied 
Biosystems Division, Foster City, CA) with a 7 micron pore size, column size 1 mm x 
100 mm, in a Perkin Elmer/Applied Biosystems Division Model 172 HPLC. Fractions 
were eluted from the column with a linear gradient of 1%/minute of acetonitrile 
(containing 0.05% TFA) in water (0.05% TFA) at a flow rate of 80 µl/minute. The 
eluent was monitored at 250 nm. The original fraction was separated into 4 major peaks 
plus other smaller components and a polypeptide was obtained which was shown to 
have a molecular weight of 12.054 Kd (by mass spectrometry) and the following N- 
terminal sequence: 

(i) Asn-Pro-Ala-Ser-Ala-Pro-Asp-Val-Pro-Thr-Ala-Ala-Gln-Gln- 
Thr-Ser-Leu-Leu-Asn-Asn-Leu-Ala-Asp-Pro-Asp-Val-Ser-Phe- 
Ala-Asp (SEQ ID NO: 62). 
This polypeptide was shown to induce proliferation and IFN-γ production in PBMC 
preparations using the assays described above. 

Additional soluble antigens were isolated from M. tuberculosis culture 
filtrate as follows. M. tuberculosis culture filtrate was prepared as described above. 
Following dialysis against Bis-Tris propane buffer, at pH 5.5, fractionation was 
were eluted with a linear 0-1.5 M NaCl gradient in the above buffer system at a flow 
rate of 10 ml/min. The column eluent was monitored at a wavelength of 214 nm. 

The fractions eluting from the ion exchange column were pooled and 
subjected to reverse phase chromatography using a Poros R2 column 4.0 x 100 mm 
(Perseptive Biosystems). Polypeptides were eluted from the column with a linear 
gradient from 0-100% acetonitrile (0.1% TFA) at a flow rate of 5 ml/min. The eluent 
was monitored at 214 nm. 

Fractions containing the eluted polypeptides were lyophilized and 
resuspended in 80 µl of aqueous 0.1% TFA and further subjected to reverse phase 
chromatography on a Vydac C4 column 4.6 x 150 mm (Western Analytical, Temecula, 
CA) with a linear gradient of 0-100% acetonitrile (0.1% TFA) at a flow rate of 2 
ml/min. Eluent was monitored at 214 nm. 

The fraction with biological activity was separated into one major peak 
plus other smaller components. Western blot of this peak onto PVDF membrane 
revealed three major bands of molecular weights 14 Kd, 20 Kd and 26 Kd. These 
polypeptides were determined to have the following N-terminal sequences, respectively: 

(j) Xaa-Asp-Ser-Glu-Lys-Ser-Ala-Thr-Ile-Lys-Val-Thr-Asp-Ala- 
Ser; (SEQ ID NO: 129) 

(k) Ala-Gly-Asp-Thr-Xaa-Ile-Tyr-Ile-Val-Gly-Asn-Leu-Thr-Ala- 
Asp; (SEQ ID NO: 130) and 

(l) Ala-Pro-Glu-Ser-Gly-Ala-Gly-Leu-Gly-Gly-Thr-Val-Gln-Ala- 
Gly; (SEQ ID NO: 131), wherein Xaa may be any amino acid. 
Using the assays described above, these polypeptides were shown to induce 
proliferation and IFN-γ production in PBMC preparations. Figs. 1A and B show the 
results of such assays using PBMC preparations from a first and a second donor, 
respectively. 

DNA sequences that encode the antigens designated as (a), (c), (d) and 
(g) above were obtained by screening a M. tuberculosis genomic library using 32P end 
corresponding to antigen (a) above identified a clone having the sequence provided in 
SEQ ID NO: 96. The polypeptide encoded by SEQ ID NO: 96 is provided in SEQ ID 
NO: 97. The screen performed using a probe corresponding to antigen (g) above 
identified a clone having the sequence provided in SEQ ID NO: 52. The polypeptide 
encoded by SEQ ID NO: 52 is provided in SEQ ID NO: 53. The screen performed 
using a probe corresponding to antigen (d) above identified a clone having the sequence 
provided in SEQ ID NO: 24, and the screen performed with a probe corresponding to 
antigen (c) identified a clone having the sequence provided in SEQ ID NO: 25. 

The above amino acid sequences were compared to known amino acid 
sequences in the gene bank using the DNA STAR system. The database searched 
contains some 173,000 proteins and is a combination of the Swiss, PIR databases along 
with translated protein sequences (Version 87). No significant homologies to the amino 
acid sequences for antigens (a)-(h) and (l) were detected. 

The amino acid sequence for antigen (i) was found to be homologous to 
a sequence from M. leprae. The full length M. leprae sequence was amplified from 
genomic DNA using the sequence obtained from GENBANK. This sequence was then 
used to screen an M. tuberculosis library and a full length copy of the M. tuberculosis 
homologue was obtained (SEQ ID NO: 94). 

The amino acid sequence for antigen (j) was found to be homologous to 
a known M. tuberculosis protein translated from a DNA sequence. To the best of the 
inventors' knowledge, this protein has not been previously shown to possess T-cell 
stimulatory activity. The amino acid sequence for antigen (k) was found to be related to 
a sequence from M. leprae. 

In the proliferation and IFN-γ assays described above, using three PPD 
positive donors, the results for representative antigens provided above are presented in 
Table 1. 



WO 99/42118 



TABLE 1 




In Table 1, responses that gave a stimulation index (SI) of between 2 and 
4 (compared to cells cultured in medium alone) were scored as +, an SI of 4-8 or 2-4 at a 
concentration of 1 µg or less was scored as ++ and an SI of greater than 8 was scored as 
+++. The antigen of sequence (i) was found to have a high SI (+++) for one donor and 
lower SI (++ and +) for the two other donors in both proliferation and IFN-γ assays. 
These results indicate that these antigens are capable of inducing proliferation and/or 
interferon-γ production. 



This example illustrates the isolation of antigens from M. tuberculosis 
lysate by screening with serum from M. tuberculosis-infected individuals. 

Desiccated M. tuberculosis H37Ra (Difco Laboratories) was added to an 
NP40 solution, and alternately homogenized and sonicated three times. The 
resulting suspension was centrifuged in microfuge tubes and the supernatant put 
through a syringe filter. The filtrate was bound to Macro 
Prep DEAE beads (BioRad, Hercules, CA). 
DNase and RNase at 0.05 mg/ml for 10 min. at room temperature and then with α-D- 
mannosidase, 0.5 U/mg at pH 4.5 for 3-4 hours at room temperature. After returning to 
pH 7.5, the material was fractionated via FPLC over a Bio Scale-Q-20 column 
(BioRad). Fractions were combined into nine pools, concentrated in a Centriprep 10 
(Amicon, Beverley, MA) and screened by Western blot for serological activity using a 
serum pool from M. tuberculosis-infected patients which was not immunoreactive with 
other antigens of the present invention. 

The most reactive fraction was run in SDS-PAGE and transferred to 
PVDF. A band at approximately 85 Kd was cut out yielding the sequence: 

(m) Xaa-Tyr-Ile-Ala-Tyr-Xaa-Thr-Thr-Ala-Gly-Ile-Val-Pro-Gly-Lys- 
Ile-Asn-Val-His-Leu-Val; (SEQ ID NO: 132), wherein Xaa may 
be any amino acid. 

Comparison of this sequence with those in the gene bank as described 
above, revealed no significant homologies to known sequences. 

A DNA sequence that encodes the antigen designated as (m) above was 
obtained by screening a genomic M. tuberculosis Erdman strain library using labeled 
degenerate oligonucleotides corresponding to the N-terminal sequence of SEQ ID 
NO: 132. A clone was identified having the DNA sequence provided in SEQ ID NO: 
198. This sequence was found to encode the amino acid sequence provided in SEQ ID 
NO: 199. Comparison of these sequences with those in the genebank revealed some 
similarity to sequences previously identified in M. tuberculosis and M. bovis. 
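
Degenerate oligonucleotides of the kind mentioned above are usually designed against 
the stretch of the N-terminal sequence that can be encoded by the fewest distinct 
codon combinations. The Python sketch below illustrates that calculation for 
sequence (m). It is not the design procedure used in this document: the function 
names, the six-residue window, the use of the standard genetic code, and the 
treatment of Xaa as a fully degenerate NNN codon are all assumptions made for the 
example. 

```python
from math import prod

# Number of codons per amino acid in the standard genetic code.
CODONS_PER_RESIDUE = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
    "X": 64,  # assumption: unknown residue covered by NNN
}


def degeneracy(peptide: str) -> int:
    """Number of distinct DNA sequences that could encode `peptide`."""
    return prod(CODONS_PER_RESIDUE[res] for res in peptide)


def least_degenerate_window(peptide: str, residues: int = 6):
    """Return (start, window, degeneracy) for the least degenerate window."""
    best = None
    for start in range(len(peptide) - residues + 1):
        window = peptide[start:start + residues]
        d = degeneracy(window)
        if best is None or d < best[2]:
            best = (start, window, d)
    return best


# Sequence (m) in one-letter code, with Xaa written as X.
antigen_m = "XYIAYXTTAGIVPGKINVHLV"
print(least_degenerate_window(antigen_m))
```

A six-residue window corresponds to an 18-nucleotide oligonucleotide; a different 
window length can be passed through the `residues` parameter. 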




EXAMPLE 3 

This example illustrates the preparation of DNA sequences encoding 
M. tuberculosis antigens by screening a M. tuberculosis expression library with sera 
obtained from patients infected with M. tuberculosis. 

Genomic DNA was isolated from the M. tuberculosis strain H37Ra. The 
DNA was randomly sheared and used to construct an expression library using the 
Lambda ZAP expression system (Stratagene, La Jolla, CA). Rabbit anti-sera was 
generated against secretory proteins of the M. tuberculosis strains H37Ra, H37Rv and 
Erdman by immunizing a rabbit with concentrated supernatant of the M. tuberculosis 
cultures. Specifically, the rabbit was first immunized subcutaneously with 200 µg of 
protein antigen in a total volume of 2 ml containing 100 µg muramyl dipeptide 
(Calbiochem, La Jolla, CA) and 1 ml of incomplete Freund's adjuvant. Four weeks later 
the rabbit was boosted subcutaneously with 100 µg antigen in incomplete Freund's 
adjuvant. Finally, the rabbit was immunized intravenously four weeks later with 50 µg 
protein antigen. The anti-sera were used to screen the expression library as described in 
Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor 
Laboratories, Cold Spring Harbor, NY, 1989. Bacteriophage plaques expressing 
immunoreactive antigens were purified. Phagemid from the plaques was rescued and 
the nucleotide sequences of the M. tuberculosis clones deduced. 

Thirty two clones were purified. Of these, 25 represent sequences that 
have not been previously identified in M. tuberculosis. Proteins were induced by IPTG 
and purified by gel elution, as described in Skeiky et al., J. Exp. Med. 181:1527-1537, 
1995. 

;-- H >-. Representative: jamai seauenc-s nf nv \ ^ v 

^uuenc.s or n\A moiccuies itltrntitied :n 'his scrct;n ^ 

" ' " ^^nonciny prcaicrai .inmio acid seuuences art- 

XJU — *:t.: rvnown scuuciicl's :n :he w 
-n,: .sing the databases described above. :t was f o„„d that the clones rctcrrcd o 

; rC!naIler " ihRA:A - TbRA1 °' TbR - A1S ' ™ r->R.-\2 l) ,SE0 ID N( )S ^, 



secucnc»s pmwusiv uient.fted in \/tr^..v.vr:;,,« >, r7 .. 
::; l.' •;.■>•... •-, . , . . 
previously identified in M. tuberculosis. No significant homologies were found to 
TbRA1, TbRA3, TbRA4, TbRA9, TbRA10, TbRA13, TbRA17, TbRa19, TbRA29, 
TbRA32, TbRA36 and the overlapping clones TbRA35 and TbRA12 (SEQ ID 
NOS: 64, 78, 82, 83, 65, 68, 70, 72, 76, 79, 81, 80, 67, respectively). The clone 
TbRa24 is overlapping with clone TbRa29. 

The genomic DNA library described above, and an additional H37Rv 
library, were screened using pools of sera obtained from patients with active 
tuberculosis. To prepare the H37Rv library, M. tuberculosis strain H37Rv genomic 
DNA was isolated, subjected to partial Sau3A digestion and used to construct an 
expression library using the Lambda Zap expression system (Stratagene, La Jolla, Ca). 
Three different pools of sera, each containing sera obtained from three individuals with 
active pulmonary or pleural disease, were used in the expression screening. The pools 
were designated TbL, TbM and TbH, referring to relative reactivity with H37Ra lysate 
(i.e., TbL = low reactivity, TbM = medium reactivity and TbH = high reactivity) in both 
ELISA and immunoblot format. A fourth pool of sera from seven patients with active 
pulmonary tuberculosis was also employed. All of the sera lacked increased reactivity 
with the recombinant 38 kD M. tuberculosis H37Ra phosphate-binding protein. 
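
Sau3A recognizes the four-base site GATC and cuts immediately 5' of the G, so a 
partial digest yields fragments spanning any pair of cut sites rather than only 
adjacent ones. The Python sketch below illustrates the difference between complete 
and partial digestion on a short linear sequence. The function names and the example 
sequence are hypothetical, and the model ignores methylation sensitivity and any 
size selection applied during library construction. 

```python
from itertools import combinations

SITE = "GATC"  # Sau3A recognition sequence; cut falls immediately before the G


def cut_positions(dna: str) -> list[int]:
    """0-based positions at which Sau3A would cut a linear DNA molecule."""
    dna = dna.upper()
    positions = []
    start = dna.find(SITE)
    while start != -1:
        positions.append(start)
        start = dna.find(SITE, start + 1)
    return positions


def complete_digest(dna: str) -> list[int]:
    """Fragment lengths, in order, when every site is cut."""
    bounds = [0] + cut_positions(dna) + [len(dna)]
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


def partial_digest_lengths(dna: str) -> list[int]:
    """All distinct fragment lengths obtainable when any subset of sites is cut."""
    bounds = sorted(set([0] + cut_positions(dna) + [len(dna)]))
    return sorted({b - a for a, b in combinations(bounds, 2)})


example = "AAGATCTTTTGATCCC"  # hypothetical sequence with two GATC sites
print(complete_digest(example))
print(partial_digest_lengths(example))
```

The larger set of fragment sizes from the partial digest is what allows overlapping 
inserts covering the whole genome to be cloned. 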

All pools were pre-adsorbed with E. coli lysate and used to screen the 
H37Ra and H37Rv expression libraries, as described in Sambrook et al., Molecular 
Cloning: A Laboratory Manual, Cold Spring Harbor Laboratories, Cold Spring Harbor, 
NY, 1989. Bacteriophage plaques expressing immunoreactive antigens were purified. 
Phagemid from the plaques was rescued and the nucleotide sequences of the 
M. tuberculosis clones deduced. 

JllilM ^ " .^e. 'represented .sejeences :ha; 
::ad dr^ Lr-.-n previous. :d— w .... . 
NO. 43) and TbH-4-FWD (SEQ. ID NO. 44) are non-contiguous sequences from the 
same clone. Amino acid sequences for the antigens hereinafter identified as Tb38-1, 
TbH-4, TbH-8, TbH-9, and TbH-12 are shown in SEQ ID NOS.: 89-93. Comparison 
of these sequences with known sequences in the gene bank using the databases 
identified above revealed no significant homologies to TbH-4, TbH-8, TbH-9 and 
TbM-3, although weak homologies were found to TbH-9. TbH-12 was found to be 
homologous to a 34 kD antigenic protein previously identified in M. paratuberculosis 
(Acc. No. S28515). Tb38-1 was found to be located 34 base pairs upstream of the open 
reading frame for the antigen ESAT-6 previously identified in M. bovis (Acc. 
No. U34848) and in M. tuberculosis (Sorensen et al., Infec. Immun. 63:1710-1717, 
1995). 

Probes derived from Tb38-1 and TbH-9, both isolated from an H37Ra 
library, were used to identify clones in an H37Rv library. Tb38-1 hybridized to 
Tb38-1F2, Tb38-1F3, Tb38-1F5 and Tb38-1F6 (SEQ. ID NOS: 107, 108, 111, 113, and 
114). (SEQ ID NOS. 107 and 108 are non-contiguous sequences from clone Tb38- 
1F2.) Two open reading frames were deduced in Tb38-1F2; one corresponds to Tb37FL 
(SEQ. ID. NO. 109), the second, a partial sequence, may be the homologue of Tb38-1 
and is called Tb38-IN (SEQ. ID NO. 110). The deduced amino acid sequence of Tb38- 
1F3 is presented in SEQ. ID. NO. 112. A TbH-9 probe identified three clones in the 
H37Rv library: TbH-9-FL (SEQ. ID NO. 101), which may be the homologue of TbH-9 
(H37Ra), TbH-9-1 (SEQ. ID NO. 103), and TbH-8-2 (SEQ. ID NO. 105) is a partial 
clone of TbH-9. The deduced amino acid sequences for these three clones are presented 
in SEQ ID NOS. 102, 104 and 106. 

Further screening of the M. tuberculosis genomic DNA library, as 
described above, resulted in the recovery of ten additional reactive clones, representing 
seven different genes. One of these genes was identified as the 38 Kd antigen discussed 
above, one was determined to be identical to the 14 Kd alpha crystallin heat shock 
protein previously shown to be present in M. tuberculosis, and a third was determined 
TbH-33) are provided in SEQ ID NO: 133-136, respectively, with the corresponding 
predicted amino acid sequences being provided in SEQ ID NO: 137-140, respectively. 
The DNA and amino acid sequences for these antigens were compared with those in the 
gene bank as described above. No homologies were found to the 5' end of TbH-29 
(which contains the reactive open reading frame), although the 3' end of TbH-29 was 
found to be identical to the M. tuberculosis cosmid Y227. TbH-32 and TbH-33 were 
found to be identical to the previously identified M. tuberculosis insertion element 
IS6110 and to the M. tuberculosis cosmid Y50, respectively. No significant homologies 
to TbH-30 were found. 

Positive phagemid from this additional screening were used to infect E. 
coli XL-1 Blue MRF', as described in Sambrook et al., supra. Induction of recombinant 
protein was accomplished by the addition of IPTG. Induced and uninduced lysates 
were run in duplicate on SDS-PAGE and transferred to nitrocellulose filters. Filters 
were reacted with human M. tuberculosis sera (1:200 dilution) reactive with TbH and a 
rabbit sera (1:200 or 1:250 dilution) reactive with the N-terminal 4 Kd portion of lacZ. 
Sera incubations were performed for 2 hours at room temperature. Bound antibody was 
detected by addition of 125I-labeled Protein A and subsequent exposure to film for 
variable times ranging from 16 hours to 11 days. The results of the immunoblots are 
summarized in Table 2. 

TABLE 2 

Antigen          Human M. tb Sera          Anti-lacZ Sera 

(table body not recoverable from the source) 

the human M. tuberculosis sera is directed towards the fusion protein. Antigens 
reactive with the anti-lacZ sera but not with the human M. tuberculosis sera may be the 
result of the human M. tuberculosis sera recognizing conformational epitopes, or the 
antigen-antibody binding kinetics may be such that the 2 hour sera exposure in the 
immunoblot is not sufficient. 

Studies were undertaken to determine whether the antigens TbH-9 and 
Tb38-1 represent cellular proteins or are secreted into M. tuberculosis culture media. In 
the first study, rabbit sera were raised against A) secretory proteins of M. tuberculosis, 
B) the known secretory recombinant M. tuberculosis antigen 85b, C) recombinant 
Tb38-1 and D) recombinant TbH-9, using protocols substantially as described in 
Example 3. Total M. tuberculosis lysate, concentrated supernatant of M. tuberculosis 
cultures and the recombinant antigens 85b, TbH-9 and Tb38-1 were resolved on 
denaturing gels, immobilized on nitrocellulose membranes and duplicate blots were 
probed using the rabbit sera described above. 

The results of this analysis using control sera (panel I) and antisera 
(panel II) against secretory proteins, recombinant 85b, recombinant Tb38-1 and 
recombinant TbH-9 are shown in Figures 2A-D, respectively, wherein the lane 
designations are as follows: 1) molecular weight protein standards; 2) 5 µg of M. 
tuberculosis lysate; 3) 5 µg secretory proteins; 4) 50 ng recombinant Tb38-1; 5) 50 ng 
recombinant TbH-9; and 6) 50 ng recombinant 85b. The recombinant antigens were 
engineered with six terminal histidine residues and would therefore be expected to 
migrate with a mobility approximately 1 kD larger than the native protein. In Figure 
2D, recombinant TbH-9 is lacking approximately 10 kD of the full-length 42 kD 
antigen, hence the significant difference in the size of the immunoreactive native TbH-9 
antigen in the lysate lane (indicated by an arrow). These results demonstrate that Tb38-1 
and TbH-9 are intracellular antigens and are not actively secreted by M. tuberculosis. 
The finding that TbH-9 is an intracellular antigen was confirmed by 
proliferative response to secretory proteins, recombinant TbH-9 and a 
control M. tuberculosis antigen, TbRa11, was determined by measuring uptake of 
tritiated thymidine, as described in Example 1. As shown in Figure 3A, the clone 
responds specifically to TbH-9, showing that TbH-9 is not a 
significant component of M. tuberculosis secretory proteins. Figure 3B shows the 
production of IFN-γ by a second TbH-9-specific T cell clone prepared from 
PBMC from a healthy PPD-positive donor, following stimulation of the T cell clone 
with secretory proteins, PPD or recombinant TbH-9. These results further confirm that 
TbH-9 is not secreted by M. tuberculosis. 


Genomic DNA was isolated from M. tuberculosis Erdman strain, 
randomly sheared and used to construct an expression library employing the Lambda 
ZAP expression system (Stratagene, La Jolla, CA). The resulting library was screened 
using pools of sera obtained from individuals with extrapulmonary tuberculosis, as 
described above, with the secondary antibody being goat anti-human 
IgG + A + M (H + L) conjugated with alkaline phosphatase. 

Eighteen clones were purified. Of these, 4 clones (hereinafter referred to 
as XP14, XP24, XP31 and XP32) were found to bear some similarity to known 
sequences. The determined DNA sequences for XP14, XP24 and XP31 are provided in 
SEQ ID NOS: 151-153, respectively, with the 5' and 3' DNA sequences for XP32 being 
provided in SEQ ID NOS: 154 and 155, respectively. The predicted amino acid 
sequence for XP14 is provided in SEQ ID NO: 156. The reverse complement of XP14 
was found to encode the amino acid sequence provided in SEQ ID NO: 157. 

Comparison of the sequences for the remaining 14 clones (hereinafter 
referred to as XP1-XP6, XP17-XP19, XP22, XP25, XP27, XP30 and XP36) with those in 
NOS: 158 and 159, respectively, with the 5' sequences for XP4, XP5, XP17 and XP30 
being shown in SEQ ID NOS: 160-163, respectively, and the 5' and 3' sequences for 
XP2, XP3, XP6, XP18, XP19, XP22 and XP25 being shown in SEQ ID NOS: 164 and 
165; 166 and 167; 168 and 169; 170 and 171; 172 and 173; 174 and 175; and 176 and 
177, respectively. XP1 was found to overlap with the DNA sequences for TbH4 
disclosed above. The full-length DNA sequence for TbH4-XP1 is provided in SEQ ID 
NO: 178. This DNA sequence was found to contain an open reading frame encoding 
the amino acid sequence shown in SEQ ID NO: 179. The reverse complement of 
TbH4-XP1 was found to contain an open reading frame encoding the amino acid 
sequence shown in SEQ ID NO: 180. The DNA sequence for XP36 was found to 
contain two open reading frames encoding the amino acid sequence shown in SEQ ID 
NOS: 181 and 182, with the reverse complement containing an open reading frame 
encoding the amino acid sequence shown in SEQ ID NO: 183. 
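
Finding an open reading frame on the reverse complement, as reported for several 
clones above, means scanning all six reading frames of the insert. The Python sketch 
below shows one simple way to do this. It is illustrative only: the function names 
and the test sequence are hypothetical, ORFs are assumed to begin at ATG (alternative 
start codons such as GTG are not considered), and coordinates on the minus strand 
refer to the reverse-complemented sequence. 

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")
STOPS = {"TAA", "TAG", "TGA"}


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA string (A, C, G, T only)."""
    return dna.translate(COMPLEMENT)[::-1]


def orfs(dna: str, min_codons: int = 2):
    """Yield (strand, start, end) for ATG...stop ORFs in all six frames.

    `end` is exclusive and includes the stop codon.
    """
    dna = dna.upper()
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            start = None
            for pos in range(frame, len(seq) - 2, 3):
                codon = seq[pos:pos + 3]
                if start is None and codon == "ATG":
                    start = pos
                elif start is not None and codon in STOPS:
                    if (pos + 3 - start) // 3 >= min_codons:
                        yield strand, start, pos + 3
                    start = None


forward = "CCATGAAATTTGGGTAACC"  # hypothetical insert with one short ORF
print(list(orfs(forward)))
print(list(orfs(reverse_complement(forward))))
```

Running the same scan on the reverse-complemented insert reports the ORF on the 
minus strand, which mirrors the situation described for the clones above. 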

Recombinant XP1 protein was prepared as described above, 
with a metal ion affinity chromatography column being employed for purification. 
Recombinant XP1 was found to stimulate cell proliferation and IFN-γ production in T 
cells isolated from M. tuberculosis-immune donors. 



" " lJ!L " rorr l/ i-ramai: strain 

randomly sheared and used to construct an expression library employing the 
Lambda ZAP expression system (Stratagene, La Jolla, CA), as described above. 
Pooled serum obtained from M. tuberculosis-infected patients and that was shown to 
react with M. tuberculosis lysate but not with the previously expressed proteins 38 kD, 
Tb38-1, TbRa3, TbH4, DPEP and TbRa11, was used to screen the expression library as 
described above in Example 3B, with the 
(hereinafter referred to as LSER-10, LSER-11, LSER-12, LSER-13, LSER-16, LSER- 
18, LSER-23, LSER-24, LSER-25 and LSER-27). The determined 5' cDNA sequences 
for LSER-10, LSER-11, LSER-12, LSER-13, LSER-16 and LSER-25 are provided in 
SEQ ID NO: 237-242, respectively, with the corresponding predicted amino acid 
sequences for LSER-10, LSER-12, LSER-13, LSER-16 and LSER-25 being provided in 
SEQ ID NO: 243-247, respectively. The determined full-length cDNA sequences for 
LSER-18, LSER-23, LSER-24 and LSER-27 are shown in SEQ ID NO: 248-251, 
respectively, with the corresponding predicted amino acid sequences being provided in 
SEQ ID NO: 252-255. The remaining seventeen clones were found to show 
similarities to unknown sequences previously identified in M. tuberculosis. The 
determined 5' cDNA sequences for sixteen of these clones (hereinafter referred to as 
LSER-1, LSER-3, LSER-4, LSER-5, LSER-6, LSER-8, LSER-14, LSER-15, LSER-17, 
LSER-19, LSER-20, LSER-22, LSER-26, LSER-28, LSER-29 and LSER-30) are 
provided in SEQ ID NO: 256-271, respectively, with the corresponding predicted amino 
acid sequences for LSER-1, LSER-3, LSER-5, LSER-6, LSER-8, LSER-14, LSER-15, 
LSER-17, LSER-19, LSER-20, LSER-22, LSER-26, LSER-28, LSER-29 and LSER-30 
being provided in SEQ ID NO: 272-286, respectively. The determined full-length 
cDNA sequence for the clone LSER-9 is provided in SEQ ID NO: 287. The reverse 
complement of LSER-6 (SEQ ID NO: 288) was found to encode the predicted amino 
acid sequence of SEQ ID NO: 289. 



M. tuberculosis lysate was prepared as described above in Example 1. 
The resulting material was fractionated by HPLC and the fractions screened by Western 
blot for serological activity with a serum pool from M. tuberculosis-infected patients 
which showed little or no immunoreactivity with other antigens of the present 
Bacteriophage plaques expressing immunoreactive antigens were purified. Phagemid 
from the plaques was rescued and the nucleotide sequences of the M. tuberculosis 
clones determined. 

Ten different clones were purified. Of these, one was found to be 
TbRa35, described above, and one was found to be the previously identified M. 
tuberculosis antigen, HSP60. Of the remaining eight clones, six (hereinafter referred to 
as RDIF2, RDIF5, RDIF8, RDIF10, RDIF11 and RDIF12) were found to bear some 
similarity to previously identified M. tuberculosis sequences. The determined DNA 
sequences for RDIF2, RDIF5, RDIF8, RDIF10 and RDIF11 are provided in SEQ ID 
NOS: 184-188, respectively, with the corresponding predicted amino acid sequences 
being provided in SEQ ID NOS: 189-193, respectively. The 5' and 3' DNA sequences 
for RDIF12 are provided in SEQ ID NOS: 194 and 195, respectively. No significant 
homologies were found to the antigen RDIF7. The determined DNA and predicted 
amino acid sequences for RDIF7 are provided in SEQ ID NOS: 196 and 197, 
respectively. One additional clone, referred to as RDIF6, was isolated; however, this 
was found to be identical to RDIF5. 

Recombinant RDIF6, RDIF8, RDIF10 and RDIF11 were prepared as 
described above. These antigens were found to stimulate cell proliferation and IFN-γ 
production in T cells isolated from M. tuberculosis-immune donors. 



'p.ott ; :\ ! )e':':.\ ■> 



An M. tuberculosis polypeptide was isolated from tuberculin purified 
protein derivative (PPD) as follows. 

PPD was prepared as published with some modification 
Rv strain was grown for 6 weeks in synthetic medium in roller bottles at 37°C. Bottles 
containing the bacterial growth were then heated to 100°C in water vapor for 3 hours. 
Cultures were sterile filtered using a 0.22 µ filter and the liquid phase was concentrated 
20 times using a 3 kD cut-off membrane. Proteins were precipitated once with 50% 
ammonium sulfate solution and eight times with 25% ammonium sulfate solution. The 
resulting proteins (PPD) were fractionated by reverse phase liquid chromatography 
(RP-HPLC) using a C18 column (7.8 x 300 mm; Waters, Milford, MA) in a Biocad 
HPLC system (Perseptive Biosystems, Framingham, MA). Fractions were eluted from 
the column with a linear gradient from 0-100% buffer (0.1% TFA in acetonitrile). The 
flow rate was 10 ml/minute and eluent was monitored at 214 nm and 280 nm. 

Six fractions were collected, dried, suspended in PBS and tested 
individually in M. tuberculosis-infected guinea pigs for induction of delayed type 
hypersensitivity (DTH) reaction. One fraction was found to induce a strong DTH 
reaction and was subsequently fractionated further by RP-HPLC on a microbore Vydac 
C18 column (Cat. No. 218TP5115) in a Perkin Elmer/Applied Biosystems Division 
Model 172 HPLC. Fractions were eluted with a linear gradient from 5-100% buffer 
(0.05% TFA in acetonitrile) with a flow rate of 80 µl/minute. Eluent was monitored at 
215 nm. Eight fractions were collected and tested for induction of DTH in M. 
tuberculosis-infected guinea pigs. One fraction was found to induce strong DTH of 
about 16 mm induration. The other fractions did not induce detectable DTH. The 
positive fraction was submitted to SDS-PAGE gel electrophoresis and found to contain 
a single protein band of approximately 12 kD molecular weight. 

This polypeptide, hereinafter referred to as DPPD, was sequenced from
the amino terminal using a Perkin Elmer/Applied Biosystems Division Procise 492
protein sequencer as described above and found to have the N-terminal sequence shown
in SEQ ID NO: ... Comparison of this sequence with known sequences in the gene
bank as described above revealed no known homologies. Four cyanogen bromide
fragments of DPPD were isolated and found to have the ...
sequence with a sequence present within the M. tuberculosis cosmid MTY21C. An
open reading frame of 336 bp was identified. The full-length DNA sequence for DPPD
is provided in SEQ ID NO: 235, with the corresponding full-length amino acid
sequence being provided in SEQ ID NO: 236.
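
As a rough plausibility check on the open reading frame reported above, its length in base pairs can be converted to a codon count and an approximate protein mass. The sketch below assumes that the 336 bp figure includes a stop codon and uses a rule-of-thumb average residue mass of 110 Da. Both are assumptions rather than values from the text, and the function name `orf_summary` is hypothetical.

```python
# Rule-of-thumb average residue mass; an assumption, not a value from the text.
AVG_RESIDUE_MASS_DA = 110.0


def orf_summary(orf_length_bp, includes_stop_codon=True):
    """Return (codons, residues, approximate mass in kDa) for an ORF length."""
    if orf_length_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_length_bp // 3
    # A terminal stop codon does not encode a residue.
    residues = codons - 1 if includes_stop_codon else codons
    mass_kda = residues * AVG_RESIDUE_MASS_DA / 1000.0
    return codons, residues, mass_kda


codons, residues, mass_kda = orf_summary(336)
print(codons, residues, round(mass_kda, 1))
```

Under these assumptions the estimate is about 12 kDa, which can be compared with the band size seen on SDS-PAGE.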



EXAMPLE 5

DNA SEQUENCES ENCODING M. TUBERCULOSIS ANTIGENS

Genomic DNA was isolated from M. tuberculosis Erdman strain,
randomly sheared and used to construct an expression library employing the Lambda
ZAP expression system (Stratagene, La Jolla, CA). Serum samples were obtained from
a cynomolgous monkey 18, 33, 51 and 56 days following infection with M. tuberculosis
Erdman strain. These samples were pooled and used to screen the M. tuberculosis
genomic DNA expression library using the procedure described above in Example 3C.

Twenty clones were purified. The determined 5' DNA sequences for the clones
referred to as MO-1, MO-2, ..., MO-26, MO-28, ... are provided in SEQ ID NO: ...,
respectively, with the corresponding predicted amino acid sequences being provided in
SEQ ID NO: ... The full-length DNA sequence of the clone MO-10 is provided in
SEQ ID NO: 232,

with the corresponding predicted amino acid sequence being provided in SEQ ID
NO: ... The DNA sequence for the clone MO-2... is provided in SEQ ID NO: 23...

Clones MO-... and MO-... were found to show a degree of relatedness
and showed some homology to a previously identified unknown M. tuberculosis
sequence and to cosmid ... MO-... was found to show some ... clones MO-...,
MO-... and MO-...
and MO-34 were found to show some homology to cosmid SCY21B4 and M.
smegmatis integration host factor, and were both found to show some homology to a
previously identified, unknown M. tuberculosis sequence. ... was found to show
some homology to M. tuberculosis heat shock protein 65. MO-8, MO-9, MO-10, MO-26
and MO-29 were found to be highly related to each other and to show some
homology to M. tuberculosis dihydrolipoamide succinyltransferase. MO-28, MO-31 and
MO-32 were found to be identical and to show some homology to a previously
identified M. tuberculosis protein. MO-33 was found to show some homology to a
previously identified 14 kDa M. tuberculosis heat shock protein.

Further studies using the above protocol resulted in the isolation of an
additional four clones, hereinafter referred to as MO-12, MO-13, MO-19 and MO-39.
The determined 5' cDNA sequences for these clones are provided in SEQ ID NO:
290-293, respectively, with the corresponding predicted protein sequences being
provided in SEQ ID NO: 294-297, respectively. Comparison of these sequences with
those in the gene bank as described above revealed no significant homologies to MO-39.
MO-12, MO-13 and MO-19 were found to show some homologies to unknown sequences
previously isolated from M. tuberculosis.

EXAMPLE 6
^L^^S^i 21LU JIL^ L:D!LU Y 



This example illustrates isolation of DNA sequences encoding
M. tuberculosis antigens by screening of a novel expression library with sera from
M. tuberculosis-infected patients that were shown to be unreactive with a panel of the
recombinant M. tuberculosis antigens TbRa11, TbRa3, Tb38-1, TbH4, TbF and 38 kD.

Genomic DNA from M. tuberculosis Erdman strain was randomly
sheared to an average size of 2 kb, and blunt ended with ...
extract (Novagen). The resulting library was screened with sera from several M.
tuberculosis donors that had been shown to be negative on a panel of previously
identified M. tuberculosis antigens as described above in Example 3B.

A total of 22 different clones were isolated. By comparison, screening of
the λZap library described above using the same sera did not result in any positive hits.
One of the clones was found to represent TbRa11, described above. The determined 5'
cDNA sequences for 19 of the remaining 21 clones (hereinafter referred to as Erdsn1,
Erdsn2, Erdsn4-Erdsn10, Erdsn12-18, Erdsn21-Erdsn23 and Erdsn25) are provided in
SEQ ID NO: 298-317, respectively, with the determined 3' cDNA sequences for
Erdsn1, Erdsn2, Erdsn4, Erdsn5, Erdsn7-Erdsn10, Erdsn12-Erdsn18, Erdsn21-Erdsn23
and Erdsn25 being provided in SEQ ID NO: 318-336, respectively. The complete
cDNA insert sequence for the clone Erdsn24 is provided in SEQ ID NO: 337.
Comparison of the determined cDNA sequences with those in the gene bank revealed
no significant homologies to the sequences provided in SEQ ID NO: 304, 311, 313-315,
317, 319, 324, 326, 329, 331, 333, 335 and 337. The sequences of SEQ ID NO: 298-
303, 305-310, 312, 316, 318, 320-... were found
to show some homology to unknown sequences previously identified in M.
tuberculosis.

: -a F - 



ii<.L.^r:nN.>FSo'--m.:- m n "rfr*"* -\ r^iAiiri^ 

This example illustrates the use of mass spectrometry to identify soluble
M. tuberculosis antigens.

In a first approach, M. tuberculosis culture filtrate was screened by
Western analysis using serum from a tuberculosis-infected individual. The reactive
bands were excised from a silver stained gel and the amino acid sequences determined
... the gene bank revealed homology to the 85b precursor antigen previously identified in
M. tuberculosis.

In a second approach, the high molecular weight region of M.
tuberculosis culture supernatant was studied. This area may contain immunodominant
antigens which may be useful in the diagnosis of M. tuberculosis infection. Two known
monoclonal antibodies, IT42 and IT57 (available from the Center for Disease Control,
Atlanta, GA), show reactivity by Western analysis to antigens in this vicinity, although
the identity of the antigens remains unknown. In addition, unknown high-molecular
weight proteins have been described as containing a surrogate marker for M.
tuberculosis infection in HIV-positive individuals (Jnl. Infect. Dis., 176:133-143, 1997).
To determine the identity of these antigens, two-dimensional gel electrophoresis and
two-dimensional Western analysis were performed using the antibodies IT57 and IT42.
Five protein spots in the high molecular weight region were identified, individually
excised, enzymatically digested and subjected to mass spectrometry analysis.

The determined amino acid sequences for three of these spots (referred to
as spots 1, 2 and 4) are provided in SEQ ID NO: 339, 340-341 and 342, respectively.
Comparison of these sequences with those in the gene bank revealed that spot 1 is the
previously identified PcK-1, a phosphoenolpyruvate kinase. The two sequences
isolated from spot 2 were determined to be from two DNAks, previously identified in
M. tuberculosis as heat shock proteins. Spot 4 was determined to be the previously
identified M. tuberculosis protein Kat. To the best of the inventors' knowledge,
neither PcK-1 nor the two DNAks have previously been shown to have utility in the
diagnosis of M. tuberculosis infection.
attached to the amino terminus of the peptide to provide a method of conjugation or
labeling of the peptide. Cleavage of the peptides from the solid support may be carried
out using the following cleavage mixture: trifluoroacetic
acid:ethanedithiol:thioanisole:water:phenol (40:1:2:2:3). After cleaving for 2 hours, the
peptides may be precipitated in cold methyl-t-butyl-ether. The peptide pellets may then
be dissolved in water containing 0.1% trifluoroacetic acid (TFA) and lyophilized prior
to purification by C18 reverse phase HPLC. A gradient of 0-60% acetonitrile
(containing 0.1% TFA) in water (containing 0.1% TFA) may be used to elute the
peptides. Following lyophilization of the pure fractions, the peptides may be
characterized using electrospray mass spectrometry and by amino acid analysis.

This procedure was used to synthesize a TbM-1 peptide that contains one
and a half repeats of a TbM-1 sequence. The TbM-1 peptide has the sequence
GCGDRSGGNLDQIRLRRDRSGGNL (SEQ ID NO: 63).
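
The "one and a half repeats" statement can be checked directly against the peptide sequence. The following is a minimal sketch under two assumptions: the character printed as "1" in the sequence is read as isoleucine (I), and the leading Gly-Cys-Gly is treated as a linker rather than as part of the repeat. The helper name `smallest_period` is hypothetical.

```python
def smallest_period(seq):
    """Return the smallest p such that seq[i] == seq[i + p] wherever both exist."""
    for p in range(1, len(seq) + 1):
        if all(seq[i] == seq[i + p] for i in range(len(seq) - p)):
            return p
    return len(seq)


# Assumptions: "1" in the printed sequence is isoleucine (I), and the
# N-terminal GCG is a linker, not part of the repeated TbM-1 sequence.
peptide = "GCGDRSGGNLDQIRLRRDRSGGNL"
linker, body = peptide[:3], peptide[3:]

period = smallest_period(body)      # length of one full repeat
repeat_unit = body[:period]
copies = len(body) / period         # number of repeats present
print(repeat_unit, copies)
```

With these assumptions the body of the peptide is a 14-residue unit followed by the first 7 residues of the same unit, that is, 1.5 repeats.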



EXAMPLE 9

USE OF REPRESENTATIVE ANTIGENS FOR SERODIAGNOSIS OF TUBERCULOSIS

This Example illustrates the diagnostic properties of several
representative antigens.

Assays were performed in 96-well plates, which were coated with 200 ng
antigen diluted to 50 µL in carbonate coating buffer, pH 9.6. The wells were coated
overnight at 4°C (or 2 hours at 37°C). The plate contents were then removed and the
wells were blocked for 2 hours with 200 µL of PBS/1% BSA. After the blocking step,
the wells were washed five times with PBS/0.1% Tween 20™. 50 µL sera, diluted
1:100 in PBS/0.1% Tween 20™/0.1% BSA, was then added to each well and incubated
for 30 minutes at room temperature. The plates were then washed again five times with
PBS/0.1% Tween 20™.
µL of the diluted conjugate was added to each well and incubated for 30 minutes at
room temperature. Following incubation, the wells were washed five times with
PBS/0.1% Tween 20™. 100 µL of tetramethylbenzidine peroxidase (TMB) substrate
(Kirkegaard and Perry Laboratories, Gaithersburg, MD) was added, undiluted, and
incubated for about 15 minutes. The reaction was stopped with the addition of 100 µL
of 1 N H2SO4 to each well and the plates were read at 450 nm.

Figure 4 shows the ELISA reactivity of two recombinant antigens
isolated using method A in Example 3 (TbRa3 and TbRa9) with sera from
M. tuberculosis positive and negative patients. The reactivity of these antigens is
compared to that of bacterial lysate isolated from M. tuberculosis strain H37Ra (Difco,
Detroit, MI). In both cases, the recombinant antigens differentiated positive from
negative sera. Based on cut-off values obtained from receiver-operator curves, TbRa3
detected 56 out of 87 positive sera, and TbRa9 detected 111 out of 165 positive sera.
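
The text does not state how the cut-off was read off the receiver-operator curve. One common choice, used here purely as an assumption, is the threshold that maximises Youden's J (sensitivity plus specificity minus 1). The sketch below illustrates that rule on hypothetical OD 450 readings; only the count of 111 out of 165 for TbRa9 is taken from the text, and the function names are illustrative.

```python
def youden_cutoff(positive_od, negative_od):
    """Return (cutoff, sensitivity, specificity) maximising Youden's J.

    A serum is called positive when its OD is >= cutoff. On ties the
    lowest qualifying cutoff is kept.
    """
    best = None
    for cutoff in sorted(set(positive_od) | set(negative_od)):
        sens = sum(od >= cutoff for od in positive_od) / len(positive_od)
        spec = sum(od < cutoff for od in negative_od) / len(negative_od)
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, cutoff, sens, spec)
    return best[1:]


def sensitivity(detected, total_positive):
    """Fraction of known-positive sera that the antigen detects."""
    return detected / total_positive


# Hypothetical OD 450 readings, for illustration only.
tb_sera = [0.95, 0.40, 1.20, 0.15, 0.70]
normal_sera = [0.05, 0.10, 0.12, 0.20, 0.08]
cutoff, sens, spec = youden_cutoff(tb_sera, normal_sera)

# Count reported in the text for TbRa9: 111 of 165 positive sera.
tbra9_sensitivity = sensitivity(111, 165)
print(cutoff, sens, spec, round(tbra9_sensitivity, 3))
```

Other cut-off rules (for example a fixed specificity target) would give different thresholds, so the rule actually used should be confirmed before any comparison with the reported counts.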

Figure 5 illustrates the ELISA reactivity of representative antigens
isolated using method B of Example 3. The reactivity of the recombinant antigens
TbH4, TbH12, Tb38-1 and the peptide TbM-1 (as described in Example -1) is compared
to that of the 38 kD antigen described by Andersen and Hansen, Infect. Immun.
57:2481-2488, 1989. Again, all of the polypeptides tested differentiated positive from
negative sera. Based on cut-off values obtained from receiver-operator curves, TbH4
detected 67 out of 126 positive sera, TbH12 detected 50 out of 125 positive sera, 38-1
detected 61 out of 101 positive sera and the TbM-1 peptide detected 25 out of 30
positive sera.

The reactivity of four antigens (TbRa3, TbRa9, TbH4 and TbH12) with sera
from a group of M. tuberculosis infected patients with differing reactivity in the
acid fast stain of sputum (Smithwick and David, Tubercle 52:226, 1971) was also
examined, and compared to the reactivity of M. tuberculosis lysate and the 38 kD
antigen. The results are presented in Table 3, below.
TABLE 3

REACTIVITY OF ANTIGENS WITH SERA FROM M. TUBERCULOSIS PATIENTS



ELISA Values 



Panent 


j Sputum 


Lysate 


38kD 


TbRa9 


TbHI2 


TbH4 


TbRa3 


TbOIB93I-2 


i — ! 

i j 


1.S53 


0.634 


0.99S 


1.022 

i 

1 


| 1.030 


i 1314 

i 


TMHB93I-19 


! i 


2.65- ! 

i 


2.322 


0.60S 


1 0.837 


j I .S5T 


, 2.335 


TbO!B93I-S 





2.703 : 




0.492 


0.2SI 


! 0.501 


! 2.002 


TbOIB93!-lo 


— 


i .ot>5 


1 .301 


D.6S5 


0.21o 


0.448 




Tb01B93I-H 


! ! 


2.sr , 

i 


0.O97 i 

j 


0.509 ; 

I 


0.301 


o. 1 73 


2.60S 


Tb01B93I-15 




..:s ; 


0.2S3 


0^S0S~j 


0.213 


1-537 


0.811 


Tb01B93I-lo 


j . : 
1 i 


2.908 1 


> 3 


0.899 i 


0.441 i 

1 
i 


0.593 • 


1.080 


Tb0!B93I-25 




).395 


0.13! : 


0.335 : 

! 


0.211 ; 

i 


0.107 


0.948 


Tb0IB93I-S~ 




2.o5? 


2.432 


2.2S2 ! 


|).9" ; 


1 n i 


0.35" 




— 


; o j ^ 


2.3 "0 


~* t - . 


0.8-70 




0.952 


i bu 1 B04[- i OS 






1 . 3 4 i 




1 '.3of> 


0.(^4 


.-70s 


TbiM n94[-20! 




i 


■ i a i u 


O.oM 


0. 13" 


9.064 


0.092 


Tb01B93I-S8 






: .3oo 


2.: 10 


: nSi 


0.214 


■'.530 
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TbOIB ( Ui-224 
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.91 ; 


).4~o 


o 3 < ; 
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f 1 2fr> 
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Patient 



Acid 
Fast 



ELTSA Values 



Tb01B93I-22 



SpUtUin i L ^~^D TbRa9 fbHlT^^ 



TbO 13931-3 1 



Tb01B93I-3 



0-714 | 0.451 j 2.082 



0.956 0.490 | 1.019 



0.285 
0.812 



0.269 



0.176 



1.159 



1.293 



1261 i o^6"i 0M8^3Tj^rr^ 



Tb01B93I-52 




j 0.658 


0.114 


; 0.434 


j 0.330 

j 


- I 

j 0.273 

i 


I 

! 1.140 

i 


Tb01B93I-99 




1 2.118 | 


0.584 


! 1.62 


0.119 

1 


1 0.97^ 

i 


1 0.729 


Tb()IB94[-I30 




1.349 ; 


0.224 


' 0.86 


J 0.2S2 


0.583 


i 

' 2.146 


Tb01B94I-l31 




■ 0.685 ; 

I 


0.324 




0.059 


: o.i is 


; 1.451 


AT4-0070 


Normal 


0.072 i 

1 


0.043 


0.092 


0.071 


! 0.040 


0.039 


AT4-0105 


Normal 


0.397 ! 

i 


0.121 


0.118 | 


0.103 


0.078 


0.390 


3.- 15/94-1 | 

i 


Normal 


0.22^ ; 


0.064 


0.098 ; 


0.026 


0.001 

1 


0.228 


4< 15/93-2 


Normal j 

i 


0.114 ! 


0.240 


0.071 | 


0.034 l 

j 


0.041 ! 

j 


0.264 


5 26' 9^-1 


Normal 


0.089 ■ 


0.250 . 


0.090 ■ 


0.046 ■ 


0.008 1 


0.053 


5.26.-94-3 1 


Normal 


0.139 


■U)93 


0.085 


0.01 ( J 


;i .0(r 


O.'O 



Based on cut-off values obtained from receiver-operator curves, TbRa3
detected 23 out of 27 positive sera, TbRa9 detected ... out of 27, TbH4 detected 18 out
... would have a theoretical sensitivity of ... out of 27, indicating that these antigens
should complement each other in the serological detection of M. tuberculosis infection.
In addition, several of the recombinant antigens detected positive sera that were not
detected using the 38 kD antigen, indicating that these antigens may be complementary
to the 38 kD antigen.
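
The "theoretical sensitivity" of a combination is the fraction of positive sera detected by at least one antigen, which amounts to a set union over the per-antigen calls. The sketch below shows that calculation, together with the sera picked up by the panel but missed by a reference antigen. The serum identifiers, antigen names and calls are hypothetical and are not taken from Table 3.

```python
def combined_detection(calls, panel):
    """Return the sera in `panel` scored positive by at least one antigen."""
    detected = set()
    for positives in calls.values():
        detected |= positives & panel
    return detected


# Hypothetical calls, for illustration only (not values from Table 3).
panel = {"s1", "s2", "s3", "s4", "s5"}
calls = {
    "antigen_A": {"s1", "s2"},
    "antigen_B": {"s2", "s3"},
    "antigen_C": {"s4"},
}
reference_positive = {"s1", "s5"}   # sera detected by a reference antigen

detected = combined_detection(calls, panel)
theoretical_sensitivity = len(detected) / len(panel)
# Sera found by the panel that the reference antigen alone would miss.
complementary = detected - reference_positive
print(sorted(detected), theoretical_sensitivity, sorted(complementary))
```

The union can never fall below the best single antigen, which is why antigens with non-overlapping reactivity are the useful ones to combine.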
The results are shown in Figure 6, which indicates that TbRa11, while being negative
with sera from PPD positive and normal donors, detected sera that were negative with
the 38 kD antigen. Of the thirteen 38 kD negative sera tested, nine were positive with
TbRa11, indicating that this antigen may be reacting with a sub-group of 38 kD antigen
negative sera. In contrast, in a group of 38 kD positive sera where TbRa11 was
reactive, the mean OD 450 for TbRa11 was lower than that for the 38 kD antigen. The
data indicate an inverse relationship between the presence of TbRa11 activity and 38 kD
positivity.

The antigen TbRa2A was tested in an indirect ELISA using initially 50
µl of serum at 1:100 dilution for 30 minutes at room temperature followed by washing in
PBS Tween and incubating for 30 minutes with biotinylated Protein A (Zymed, San
Francisco, CA) at a 1:10,000 dilution. Following washing, 50 µl of streptavidin-
horseradish peroxidase (Zymed) at 1:10,000 dilution was added and the mixture
incubated for 30 minutes. After washing, the assay was developed with TMB substrate
as described above. The reactivity of TbRa2A with sera from M. tuberculosis patients
and normal donors is shown in Table 4. The mean value for reactivity of TbRa2A with
sera from M. tuberculosis patients was 0.444 with a standard deviation of 0.309. The
mean for reactivity with sera from normal donors was 0.109 with a standard deviation
of 0.02... Testing of 38 kD negative sera (Figure 7) also indicated that the TbRa2A
antigen was capable of detecting sera in this category.
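
The summary statistics quoted above can be recomputed from the normal-donor readings listed in Table 4. The sketch below does this with Python's `statistics` module. It assumes that two garbled readings in the table are 0.128 and 0.106, and it uses the sample (n - 1) standard deviation, since the text does not say which form was used.

```python
from statistics import mean, stdev

# Normal-donor OD 450 readings transcribed from Table 4. The values 0.128
# and 0.106 are assumptions recovered from garbled digits in the source.
normal_od = [0.036, 0.126, 0.130, 0.135, 0.133, 0.128,
             0.088, 0.108, 0.106, 0.108, 0.105]

normal_mean = mean(normal_od)
normal_sd = stdev(normal_od)   # sample standard deviation (n - 1)
print(round(normal_mean, 3), round(normal_sd, 3))
```

The same two calls could be applied to the patient readings once the illegible entries in the table have been checked against the original document.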





TABLE 4

Serum ID      Status    OD 450

Tb85          TB        ...
Tb86          TB        ...
Tb87          TB        ...
Tb93          TB        0.232
Tb94          TB        ...
Tb95          TB        0.4...
Tb96          TB        ...
Tb97          TB        ...
Tb99          TB        0.328
Tb100         TB        0.817
Tb101         TB        0.607
Tb102         TB        0.10...
Tb103         TB        0.228
Tb107         TB        0.324
Tb109         TB        1.57...
Tb112         TB        0.338
...           Normal    0.036
AT4-0043      Normal    0.126
AT4-0044      Normal    0.130
AT4-0052      Normal    0.135
AT4-0053      Normal    0.133
AT4-0062      Normal    0.128
AT4-0070      Normal    0.088
AT4-0091      Normal    0.108
AT4-0100      Normal    0.106
AT4-0105      Normal    0.108
AT4-0100      Normal    0.105




The reactivity of the recombinant antigen (g) (SEQ ID NO: ...) with sera
from ... and normal donors was determined by ELISA as
described above. Figure 8 shows the results of the titration of antigen (g) with four
M. tuberculosis positive sera ... with the 38 kD antigen and with four
donor sera. All four positive sera ...

The reactivity of the recombinant antigen ... (SEQ ID NO: ...)
with sera from M. tuberculosis patients, PPD positive donors and normal donors was
determined by indirect ELISA as described above. The results are shown in Figure ...
... detected ... out of ... sera, ... PPD positive sera and ...
out of ... normal sera.
OD 450 was demonstrated to be higher with sera from M. tuberculosis patients than
from normal donors, with the mean OD 450 being significantly higher in the indirect
ELISA than in the direct ELISA. Figure 11 is a titration curve for the reactivity of
recombinant TbH-33 with sera from M. tuberculosis patients and from normal donors
showing an increase in OD 450 with increasing concentration of antigen.

The reactivity of the recombinant antigens RDIF6, RDIF8 and RDIF10
(SEQ ID NOS: 184-187, respectively) with sera from M. tuberculosis patients and
normal donors was determined by ELISA as described above. RDIF6 detected 6 out of
32 M. tuberculosis sera and 0 out of 15 normal sera; RDIF8 detected ... out of 32 M.
tuberculosis sera and 0 out of 15 normal sera; and RDIF10 detected 4 out of 27 M.
tuberculosis sera and ... out of 15 normal sera. In addition, RDIF10 was found to detect
0 out of 5 sera from PPD-positive donors.

The antigens MO-1, MO-2, MO-4, MO-28 and MO-29 described above
in Example 5, were expressed in E. coli and purified using a hexahistidine tag. The
reactivity of these antigens with both M. tuberculosis positive and negative sera was
examined by ELISA as described above. Titration curves showing the reactivity of
MO-1, MO-2, MO-4, MO-28 and MO-29 at different solid phase coat levels when
tested against four M. tuberculosis positive sera and four M. tuberculosis negative sera
are shown in Figs. 12A-E, respectively. Three of the clones, MO-1, MO-2 and MO-29,
were further tested on panels of HIV positive/tuberculosis (HIV/TB) positive and
extrapulmonary sera. MO-1 detected 3/20 extrapulmonary and 2/38 HIV/TB sera. On
the same sera groups, MO-2 detected 2/2... and ..., and MO-29 detected 2/2... and ...
In combination these three clones would have detected 4/20 extrapulmonary
sera and .../38 HIV/TB. In addition, MO-1 detected ... sera that had previously
been shown to react only with M. tuberculosis lysate and not with either 38 kD or with
other antigens of the subject invention.
EXAMPLE 10

A fusion protein containing TbRa3, the 38 kD antigen and Tb38-1 was
prepared as follows.

Each of the DNA constructs TbRa3, 38 kD and Tb38-1 were modified
by PCR in order to facilitate their fusion and the subsequent expression of the fusion
protein TbRa3-38 kD-Tb38-1. TbRa3, 38 kD and Tb38-1 DNA was used to perform
PCR using the primers PDM-... and PDM-... (SEQ ID NO: 141 and 142), PDM-... and
PDM-5... (SEQ ID NO: 143 and 144), and PDM-... and PDM-... (SEQ ID NO: 145 and
146), respectively. In each case, the DNA amplification was performed using 10 µl
10X Pfu buffer, 2 µl 10 mM dNTPs, 2 µl each of the PCR primers at 10 µM
concentration, 81.5 µl water, 1.5 µl Pfu DNA polymerase (Stratagene, La Jolla, CA)
and 1 µl DNA at either ... ng/µl (for TbRa3) or 50 ng/µl (for 38 kD and Tb38-1). For
TbRa3, denaturation at 94°C was performed for 2 min, followed by 40 cycles of 96°C
for ... and 72°C for ..., and ... for 4 min. For 38 kD, denaturation at ...
was performed for 2 min, followed by 40 cycles of 96°C for 30 sec, 68°C for ...
sec and 72°C for 3 min, and finally by ... for 4 min. For Tb38-1, denaturation at ...
for ... min was followed by ... cycles of 96°C for ... sec, 68°C for ... and 72°C for
... min, 30 cycles of 96°C for ... and finally by ...
for 4 min.
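
The reaction recipe above can be sanity-checked by summing the component volumes and converting stock concentrations to final concentrations. The sketch below does this with exact fractions. It assumes that each primer was added at 2 µl, a value recovered from a garbled line; the dictionary keys and the helper name `final_concentration` are illustrative.

```python
from fractions import Fraction

# Volumes in microlitres, as read from the protocol text.
# Assumption: the per-primer volume is 2 ul (recovered from a garbled line).
reaction_ul = {
    "10X Pfu buffer": Fraction(10),
    "10 mM dNTPs": Fraction(2),
    "forward primer (10 uM)": Fraction(2),
    "reverse primer (10 uM)": Fraction(2),
    "water": Fraction("81.5"),
    "Pfu DNA polymerase": Fraction("1.5"),
    "template DNA": Fraction(1),
}


def final_concentration(stock, volume_ul, total_ul):
    """Dilute a stock concentration into the total reaction volume."""
    return stock * volume_ul / total_ul


total_ul = sum(reaction_ul.values())
buffer_x = final_concentration(10, reaction_ul["10X Pfu buffer"], total_ul)
dntp_mm = final_concentration(10, reaction_ul["10 mM dNTPs"], total_ul)
primer_um = final_concentration(10, reaction_ul["forward primer (10 uM)"], total_ul)
print(float(total_ul), float(buffer_x), float(dntp_mm), float(primer_um))
```

With the assumed primer volume the components add up to a round 100 µl, which gives 1X buffer, 0.2 mM dNTPs and 0.2 µM of each primer.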



The TbRa3 PCR fragment was digested with NdeI and EcoRI and cloned
directly into the pT7AL2 ... vector using the NdeI and EcoRI sites. The 38 kD PCR
fragment was digested with Sse8387I, treated with T4 polymerase to make blunt ends
and then digested with EcoRI for direct cloning into the pT7AL2Ra3 ... vector which was
digested with StuI and EcoRI. The 38-1 PCR fragment was digested with ... and
EcoRI, and directly subcloned into pT7AL2Ra3/38 kD ... digested with the same
enzymes. The whole fusion was then transferred to pET28 ...
The expression construct was transformed to BLR pLys S E. coli
(Novagen, Madison, WI) and grown overnight in LB broth with kanamycin (30 µg/ml)
and chloramphenicol (34 µg/ml). This culture (12 ml) was used to inoculate ...
2XYT with the same antibiotics and the culture was induced with IPTG at an OD560 of
0.44 to a final concentration of 1.2 mM. Four hours post-induction, the bacteria were
harvested and sonicated in 20 mM Tris (8.0), 100 mM NaCl, 0.1% DOC, 20 µg/ml
Leupeptin, 20 mM PMSF followed by centrifugation at 26,000 X g. The resulting
pellet was resuspended in 8 M urea, 20 mM Tris (8.0), 100 mM NaCl and bound to Pro-
bond nickel resin (Invitrogen, Carlsbad, CA). The column was washed ...
with the above buffer then eluted with an imidazole gradient (50 mM, 100 mM, 500
mM imidazole was added to 8 M urea, 20 mM Tris (8.0), 100 mM NaCl). The eluates
containing the protein of interest were then dialyzed against 10 mM Tris (8.0).

The DNA and amino acid sequences for the resulting fusion protein
(hereinafter referred to as TbRa3-38 kD-Tb38-1) are provided in SEQ ID NO: 147 and
148, respectively.

A fusion protein containing the two antigens TbH-9 and Tb38-1
(hereinafter referred to as TbH9-Tb38-1) without a hinge sequence, was prepared using
a similar procedure to that described above. The DNA sequence for the TbH9-Tb38-1
fusion protein is provided in SEQ ID NO: 151.

A fusion protein containing TbRa3, the antigen 38kD, Tb38-1 and DPEP
was prepared as follows.

Each of the DNA constructs TbRa3, 38 kD and Tb38-1 were modified by PCR
and cloned into vectors essentially as described above, with ...
The Tb38-1A fragment ... by a DraI site at the 3' end of the
coding region that keeps the final amino acid intact while creating a blunt restriction site
that is in frame. The TbRa3/38kD/Tb38-1A fusion was then transferred to pET28 ...
Denaturation at 94°C was performed for 2 min, followed by 10 cycles of 96°C for 15
sec, 68°C for 15 sec and 72°C for 1.5 min, 30 cycles of 96°C for 15 sec, 64°C for 15
sec and 72°C for ... min, and finally by 72°C for 4 min. The DPEP PCR
fragment ... The fusion construct was
confirmed to be correct by DNA sequencing. Recombinant protein was prepared as
described above. The DNA and amino acid sequences for the resulting fusion protein
(hereinafter referred to as ...) are provided in SEQ ID NO: ...,
respectively.

A fusion protein containing TbRa3, the antigen 38kD, Tb38-1 and TbH4
was prepared as follows.

Genomic M. tuberculosis DNA was used to PCR ... TbH4 ...
with the primers PDM-15... and PDM-1... (SEQ ID NO: ...,
respectively) and 2 µl DNA at 100 ng/µl. Denaturation at 96°C was performed for ...
min, followed by 40 cycles of 96°C for ... sec, 61°C for 20 sec and 72°C for ... min ...
with EcoRI and ScaI (New England Biolabs) and cloned directly into the
pET28Ra3/38kD/38-1A construct described above which was digested with DraI and
EcoRI. The fusion construct was confirmed to be correct by DNA sequencing.
Recombinant protein was prepared as described above. The DNA and amino acid
sequences for the resulting fusion protein (hereinafter referred to as ...) are provided
in SEQ ID NO: ..., respectively.

-Jit. b:' jeoaxate.i •■>•.- 

;;i\lt was preparcu as follows. 

pnvt . ... ;S ^ USCti : ° • nerI "™ P « pnmers PDM-P6 and 

PDM-i > iSEQ TD NO- u~ w us 

- 4 .uk, ,4S. respccnvelyj. and 1 ul PET2SRaJ ^SkD^S- 

' 12 U !: '" ^ Denaturation a, o„ - , ^ 

:o;!owea m 4.', , : . c ie.s of 9* T f or , c ,_ Y - , . _ . . 
then ramping down to 25°C slowly at 0.1°C/sec. DPEP DNA was used to perform
PCR as described above. The 38 kD fragment was digested with Eco RI (New England
Biolabs) and cloned into a modified pT7AL2 vector which was cut with Eco 72 I
(Promega) and Eco RI. The modified pT7AL2 construct was designed to have a
MGHHHHHH amino acid coding region in frame just 5' of the Eco 72 I site. The
construct was digested with Kpn 2 I (Gibco, BRL) and Pst I (New England Biolabs) and
the annealed sets of phosphorylated primers (PDM-171, PDM-172 and PDM-173,
PDM-174) were cloned in. The DPEP PCR fragment was digested with Eco RI and
Eco 72 I and cloned into this second construct which was digested with Eco 47 III (New
England Biolabs) and Eco RI. Ligations were done with a ligation kit from Panvera
(Madison, WI). The resulting construct was digested with NdeI (New England Biolabs)
and Eco RI and transferred to a modified pET28 vector. The fusion construct was
confirmed to be correct by DNA sequencing.

Recombinant protein was prepared essentially as described above. The
DNA and amino acid sequences for the resulting fusion protein (hereinafter referred to
as TbF-8) are provided in SEQ ID NO: 349 and 350, respectively.



St&imAQXl SlS OF T' -hfr n V ^ 

The effectiveness of the fusion protein TbRa3-38 kD-Tb38-1, prepared
as described above, in the serodiagnosis of tuberculosis infection was examined by
ELISA.

The ELISA protocol was as described above in Example ..., with the
fusion protein being coated at 200 ng/well. A panel of sera was chosen from a group of
tuberculosis patients previously shown ...
all three epitopes functioned with the fusion protein. ...
sera that reacted with TbRa3 only were detectable with the fusion protein. ... sera that
reacted only with Tb38-1 were also detectable, as were ... sera that reacted with 38 kD
alone. The remaining 15 sera were all positive with the fusion protein based on a cut-off
in the assay of ... This data demonstrates the
functional activity of all three epitopes in the fusion protein.



REAC-TVTTV Qf TPf-f>n n p _c£r 



Table 5 



Serum ID 



status : HLISA and/or Western 



fusion ; Fusion 
Blot Reactivity with ' Recombinant Recombinant 

™450 Status 



Individual proteins 
J8kd 1^38-1 TbRa3 




39004 
68004 

1070O4 
^2004 
)7()iK 
1 18004 
173004 
1 "50O4 

.^4004 



-I? 

TB 



_0.ti(T 

).o<r 

_L 1^52 
2.Q94 
1258 
2.? 14 



T3 






2 0~ 




TB 






1 675 j 




TB 






1 loo 1 
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308004 



TB 



3 .338 



5 1 4004 



TB 



.362 



? 1 7004 



TB 



0.763 



312004 



D176 



D162 



TB 



PPD 



PPD 



1.079 



0.145 



0.073 



D16I 



PPD 



0.09' 



D27 



PPD 



Ao- 124 



NORMAL 



o.os: 



o.o5: 



A6-125 i NORMAL 



0.0S' 



A6-I26 i NORMAL 



Ao- 12" 


1 NORMAL : 




V.J40 

0.0O4 




i Ad-128 


NORMAL 1 


______ - i 


0.034 




A6-129 


NORMAL i 


- 1 - 1 . j 


0.03" j 




A6-130 


NORMAL 1 




0.05- 


1 

! 


; AM31 


NORMAL ■ 




0 054 




A6-132 ! 


NORMAL 


i 
i 


0.022 : 




Ao- 133 i 


NORMAL ' 




o.:4~ 




Ao-134 


NORMAL 


i 


o. ;oi 




V(>-; 


NORMAL 










NORMAL 




■ A >5- 




ao-:3" 


norma;. 








AO-I3S ' 


NORMAL 




0.f.»4i 




AO- 139 


NORMAL 




■).:n; 




Ao-Uit 


NORMAL 








A^-u; 


NORMAL 




: t > f ^ 






norma; 









The reactivity of the fusion protein TbF-2 with sera from M.
tuberculosis-infected patients was examined by ELISA using the protocol described
above. The results of these studies (Table 6) demonstrate that ...
antigens function ...
Reactivity of TbF-2 Fusion Protein with TB and Normal Sera
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One „, sk ,„ „ „ m| , wmmc ^ ^ ordCT w . ^ _ 

5 construction of fusion proteins. 

From the foregoing, it will be appreciated that, although specific
embodiments of the invention have been described herein for the purpose of illustration,
various modifications may be made without deviating from the spirit and scope of the
invention.
CLAIMS

We claim:

1. A polypeptide comprising an antigenic portion of a soluble
M. tuberculosis antigen, or a variant of said antigen that differs only in conservative
substitutions and/or modifications, wherein said antigen has an N-terminal sequence
selected from the group consisting of:

(a) •^-^o-Val-.A.-p-Ala-Val-Ilc-.^-Thr-Thr-Cys-.Asn-TvT-Glv-Gln- 
Val-Val-AIa-Aia-Leu (SEQ ED N'O: 1 15); 

(b» Ala-Val-GIu-Ser-Glv-Met-r .en-Al.n.r «../:.,. tu, n_. ., - - 

„ oi;- iu-.-\ia-rro-5er 

(SEQ ID N'O: ! !6l; 

<C ^ a -^-Mc t -Lvs-P ro -Arg-T) 1 r-Glv,A S p-Giy- f Vo-L,,-Glu-Aia-Ala- 
Lys-GIu-GIy-Arg (SEQ ED NO: 17); 

d) ^-^-Trp-Cys-Pro-GIy.Gln-Pro-Phe-.^p-Pro-.AJa-Trp-Glv-Pro 
(SEQ ID NO: 11 S); 

<e» -Asp-Ile-G! > -Ser-Glu-Ser-T: ar .Giu,A S p-(n n -GIn,Xaa-A;a.\ al (SEQ LD 
NO: 1:9), 

«0 Ala-Giu-Glu-Ser-i]c,Ser-r hr .Xaa-Giu,Xaa-IlcAal-Prn (SE Q 'D 
NO: 

,g ' As P- pr ^l-Pro-Ala-P^ 
SertSEQ ID N'O: 121 ). 

<h) Ala-Pro-L:,s-Th r -Tyr-Xaa-Glu^Jiu-i.ct,l.vs-Giy.TT 1 r.Asn:-h r .( }iv 

i.SFQ ID NO 

Leu-Lcu-Asr.-Scr-icMa-A.n-;^ AM.-Vai-Ser-Phc \ >r < F( 
ID MO: ::?);and 

;,> AlJ " ,J1 ' 'My-C.lv-rhr -Vai-Gin-Aia-oiv 

(SEU ID \r... t 
wherein Xaa may be any amino acid.
2. A polypeptide comprising an immunogenic portion of an
M. tuberculosis antigen, or a variant of said antigen that differs only in conservative
substitutions and/or modifications, wherein said antigen has an N-terminal sequence
selected from the group consisting of:

(a) Asp-Pro.Pro-Asp-Pro-His-Gln-Xaa-Asp-Met-Thr-Lys-Gly-Tyr-TvT- 

Pro-Gly-Gly-Arg-.Arg-Xaa-Phe; (SEQ ED NO: 124) and 
(1» Xaa.Tyr-Ile-Ala-TvT-Xaa-Thr-TTu--Ata<}^ 

Asn-VaJ-His-Leu-Val; (SEQ ID NO: 132,. wherein Xaa may be any 

amino acid. 

3. A polypeptide comprising an antigenic portion of a soluble 
M. tuberculosis antigen, or a variant of said antigen that differs only in conservative 
substitutions and/or modifications, wherein said antigen comprises an amino acid sequence 
encoded by a DNA sequence selected from the group consisting of the sequences recited in 

SEQIDNOS: ! . 2. 4-1 1) i ■?.-><; o Win . iQ , .. 

and the complements of said sequences, and DNA 
sequences that hybridize to a sequence recited in SEQ ID NOS: 1, 2, 4-10, 13-25 g 4 ^ 
% or a complement thereof under moderately stringent conditions. 

4. A polypeptide comprising an antigenic portion of an M. tuberculosis 
antigen, or a variant of said antigen that differs only in conservative substitutions and/or 
modifications, wherein said antigen comprises an amino acid sequence encoded by a DNA 

' - -4v.,^ : , ^u>~2 { K\ ^ i.; " ' ■ - v : ' - - ■ 

" ;aK; '^^v ana L/NA :cc..e::cc, 

nai hyonchze to a .sequence recited :n SEO ID NOS 26-51. 133. !3 4. , 5X .r.S. 

-48-2.^1 "^m^q; — -> , ~ 

3:3. 324. . :s . 33: . 33 , ;md 

or a complement thereof under moderately stringent conditions. 

6. A recombinant expression vector comprising a DNA molecule 
according to claim 5. 



7. A host cell transformed with an expression vector according to claim 6. 

8. The host cell of claim 7 wherein the host cell is selected from the group 
consisting of E. coli, yeast and mammalian cells. 



9. A method for detecting M. tuberculosis infection in a biological 
sample, comprising: 

(a) contacting a biological sample with one or more polypeptides 
according to any of claims 1-4; and 

(b) detecting in the sample the presence of antibodies that bind to at least 
one of the polypeptides, thereby detecting M. tuberculosis infection in the biological sample. 

10. A method for detecting M. tuberculosis infection in a biological 
sample, comprising: 

(a) contacting a biological sample with a polypeptide having an 
N-terminal sequence selected from the group consisting of sequences provided in SEQ ID NO: 
129 and 130; and 

(b) detecting in the sample the presence of antibodies that bind to at least 
one of the polypeptides, thereby detecting M. tuberculosis infection in the biological sample. 

11. A method for detecting M. tuberculosis infection in a biological 
sample, comprising: 

(a) contacting a biological sample with one or more polypeptides encoded 
by a DNA sequence selected from the 
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sequences, and DNA sequences that hybridize to a sequence recited in SEQ ID NOS: 3, 11, 
12, 135, 136, 151-155, 184-188, 194-195, 198, 210-220, 232, 234, 256-271, 287, 288, 298- 
303, 305-310, 312, 316, 318, 320-322, 325-327, 329, 331, 333, 335 and 337; and 

(b) detecting in the sample the presence of antibodies that bind to at least 
one of the polypeptides, thereby detecting M. tuberculosis infection in the biological sample. 

12. The method of any one of claims 9-11 wherein step (a) additionally 
comprises contacting the biological sample with a 38 kD M. tuberculosis antigen and step (b) 
additionally comprises detecting in the sample the presence of antibodies that bind to the 
38 kD M. tuberculosis antigen. 

13. The method of any one of claims 9-11 wherein the polypeptide(s) are 
bound to a solid support. 

14. The method of claim 13 wherein the solid support comprises 
nitrocellulose, latex or a plastic material. 

15. The method of any one of claims 9-11 wherein the biological sample is 
selected from the group consisting of whole blood, serum, plasma, saliva, cerebrospinal fluid 
and urine. 

16. The method of claim 15 wherein the biological sample is whole blood 
or serum. 

17. A method for detecting M. tuberculosis infection in a biological 
sample, comprising: 

(a) contacting the sample with at least two oligonucleotide primers in a 
polymerase chain reaction, wherein at least one of the oligonucleotide primers is specific for a 
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(b) detecting in the sample j DNA sequence that amplifies in the presence 
of the oligonucleotide primers, thereby detecting M. tuberculosis infection. 

18. The method of claim 17 wherein at least one of the oligonucleotide 
primers comprises at least about 10 contiguous nucleotides of a DNA molecule according to 
claim 5. 

19. A method for detecting M. tuberculosis infection in a biological 
sample, comprising: 

(a) contacting the sample with at least two oligonucleotide primers in a 
polymerase chain reaction, wherein at least one of the oligonucleotide primers is specific for a 
DNA sequence selected from the group consisting of SEQ ID NOS: 3, 11, 12, 135, 136, 151- 
155, 184-188, 194-195, 198, 210-220, 232, 234, 256-271, 287, 288, 298-303, 305-310, 312, 
316, 318, 320-322, 325-327, 329, 331, 333, 335 and 337; and 

(b) detecting in the sample a DNA sequence that amplifies in the presence 
of the first and second oligonucleotide primers, thereby detecting M. tuberculosis infection. 






20. The method of claim 19 wherein at least one of the oligonucleotide 
primers comprises at least about 10 contiguous nucleotides of a DNA sequence selected from 
:he group consisting of SEQ ID NOS: 3. : :. 13, 135. !3o. iXa-iSS, ; j ^ ;ox 

12. .No. 31S, 320- ?22. 325 32". 



:0 ) -220. 232, 23^ 2156-2" 1. 28". 2SS. 2<>8-.V)3 



21. The method of claims 17 or 19 wherein the biological sample is 
selected from the group consisting of whole blood, sputum, serum, plasma, saliva, 
cerebrospinal fluid and urine. 
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ia> contacting the sample with one or more oi.gonucieot.de probes specific 
for a DNA molecule according to claim 5; and 

(b) detecting in the sample a DNA sequence that hybridizes to the 
oligonucleotide probe, thereby detecting M. tuberculosis infection. 

23. The method of claim 22 wherein the probe comprises at least about 15 
contiguous nucleotides of a DNA molecule according to claim 5. 

24. A method for detecting M. tuberculosis infection in a biological 
sample, comprising: 

(a) contacting the sample with one or more oligonucleotide probes specific 
for a DNA sequence selected from the group consisting of SEQ ID NOS: 3, 11, 12, 135, 136, 
151-155, 184-188, 194-195, 198, 210-220, 232, 234, 256-271, 287, 288, 298-303, 305-310, 312, 
316, 318, 320-322, 325-327, 329, 331, 333, 335 and 337; and 

(b) detecting in the sample a DNA sequence that hybridizes to the 
oligonucleotide probe, thereby detecting M. tuberculosis infection. 



i5C mca,oa °'' ^ ;a,ni :: ■ vncrL-;:i "i.gonucieotiue probe comprises 

selected from the croun 



.east about !5 contiguous nucleotide, M : DN. A seuuenc 



— snng.^EOID N,.,S : ;;. ;, 5 . ;s ,. ; ^ , u , (! . ^ 

234. 25(,-:-i. 2S-. 2SS. ."K.:,,;, ;,k ,^ , , , ^ ^ 

: " anu 



" lc:UoG : " linis -*- " r - J ••vht-rt-::: :r.e bioiouica. -.ampic !S 
selected from the group consisting of whole blood, sputum, serum, plasma, saliva, 
cerebrospinal fluid and urine. 
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(a» contacting the biological sample with a binding agent which is capable 
of binding to a polypeptide according to any one of claims 1-4; and 

(b) detecting in the sample a protein or polypeptide that binds to the 
binding agent, thereby detecting M. tuberculosis infection in the biological sample. 

28. A method for detecting M. tuberculosis infection in a biological 
sample, comprising: 

(a) contacting the biological sample with a binding agent which is capable 
of binding to a polypeptide having an N-terminal sequence selected from the group consisting 
of sequences provided in SEQ ID NO: 129 and 130; and 

(b) detecting in the sample a protein or polypeptide that binds to the 
binding agent, thereby detecting M. tuberculosis infection in the biological sample. 

29. A method for detecting M. tuberculosis infection in a biological 
sample, comprising: 

(a) contacting the biological sample with a binding agent which is capable 
of binding to a polypeptide encoded by a DNA sequence selected from the group consisting 
of SEQ ID NOS: 3, 11, 12, 135, 136, 151-155, 184-188, 194-195, 198, 210-220, 232, 234, 
256-271, 287, 288, 298-303, 305-310, 312, 316, 318, 320-322, 325-327, 329, 331, 333, 335 
and 337, and the complements of said sequences, and DNA sequences that hybridize to a 
sequence recited in SEQ ID NOS: 3, 11, 12, 135, 136, 151-155, 184-188, 194-195, 198, 210-220, 
232, 234, 256-271, 287, 288, 298-303, 305-310, 312, 316, 318, 320-322, 325-327, 329, 331, 
333, 335 and 337; and 

(b) detecting in the sample a protein or polypeptide that binds to the 
binding agent, thereby detecting M. tuberculosis infection in the biological sample. 
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31. The method of anv one of claims 2%29 wherein the binding agent ,s 
a polyclonal antibody. 



32. A diagnostic kit comprising: 

(a) one or more polypeptides according to any of claims 1-4; and 

(b) a detection reagent. 

33. A diagnostic kit comprising: 

(a) one or more polypeptides having an N-terminal sequence selected from 
the group consisting of sequences provided in SEQ ID NO: 129 and 130; and 

(b) a detection reagent. 



34. A diagnostic kit comprising: 

(a) one or more polypeptides encoded by a DNA sequence selected from 
the group consisting of SEQ ID NOS: 3, 11, 12, 135, 136, 151-155, 184-188, 194-195, 198, 
210-220, 232, 234, 256-271, 287, 288, 298-303, 305-310, 312, 316, 318, 320-322, 325-327, 
329, 331, 333, 335 and 337, the complements of said sequences, and DNA sequences that 
hybridize to a sequence recited in SEQ ID NOS. 

(b) a detection reagent. 



' ; ai n:s 



'-^'ic:c::: ;hc noivivcrukie: ; arc 



36. The kit of claim 35 wherein the solid support comprises nitrocellulose, 
latex or a plastic material. 
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38. 



The kit of claim 37 wherein the binding agent is selected from the 
group consisting of anti-immunoglobulins, Protein G, Protein A and lectins. 

39. The kit of claim 37 wherein the reporter group is selected from the 
group consisting of radioisotopes, fluorescent groups, luminescent groups, enzymes, biotin, 
dye particles and colloidal particles. 



40. A diagnostic kit comprising at least two oligonucleotide primers, at least 
one of the oligonucleotide primers being specific for a DNA molecule according to 
claim 5. 



41. A diagnostic kit according to claim 40, wherein at least one of the 
oligonucleotide primers comprises at least about 10 contiguous nucleotides of a DNA 
molecule according to claim 5. 



42. A diagnostic kit comprising at least two oligonucleotide primers, at least 
one of the primers being specific for a DNA sequence selected from the group consisting 
of SEQ ID NOS: 3, 11, 12, 135, 136, 151-155, 184-188, 194-195, 198, 210-220, 232, 234, 
256-271, 287, 288, 298-303, 305-310, 312, 316, 318, 320-322, 325-327, 329, 331, 333, 335 
and 337. 



— -^-oroint: o ./;a:m V n— ■ 

- ,s "^"< h -ea^ -me \- 

■'■'JKi-eoi:^ onmers comprises ai - 1S * ihmr <m 

*uucnce selected trom the group oon^stinu or SH() ID NOS * i: - ^ , " V." 

ivMSS. :cU-!95, !9S. 210- "0 ^ , _ , — >--...<>. 

" ■■-^:SS.:us-3(r, wo;,. 

' ~" * • ■ -'-v , • ^ and ""^ 
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45 A kit according to claim 44, wherein the oligonucleotide probe 
comprises at least about 15 contiguous nucleotides of a DNA molecule according to claim 5. 

46. A diagnostic kit comprising at least one oligonucleotide probe, the 
oligonucleotide probe being specific for a DNA sequence selected from the group consisting 
of SEQ ID NOS: 3, 11, 12, 135, 136, 151-155, 184-188, 194-195, 198, 210-220, 232, 234, 
256-271, 287, 288, 298-303, 305-310, 312, 316, 318, 320-322, 325-327, 329, 331, 333, 335 
and 337. 

47. A kit according to claim 46, wherein the oligonucleotide probe 
comprises at least about 15 contiguous nucleotides of a DNA sequence selected from the 
group consisting of SEQ ID NOS: 3, 11, 13, 135, 136, 151-155, 184-188, 194-195, 198, 210- 
220, 232, 234, 256-271, 287, 288, 298-303, 305-310, 312, 316, 318, 320-322, 325-327, 329, 
331, 333, 335 and 337. 



48. A monoclonal antibody that binds to a polypeptide according to any of 
claims 1-4. 



49. A polyclonal antibody that binds to a polypeptide according to any of 
claims 1-4. 



.'pepnues accorcm.: 



* :a:i:i> 



-a :umoh protein comprising one or more polypeptides according to 
■ in;, ^ne ofAaims : ~: am: ESAT-o iSi-V- T D VO oo, 
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53. A fiision protein composing one or more polypeptides according to 
any one of claims 1-4 and the M. tuberculosis antigen 38 kD (SEQ ID NO: 150). 

54. A diagnostic kit comprising: 

(a) one or more fusion proteins according to any one of claims 50-53; and 

(b) a detection reagent. 
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Titration of Mo-1 antigen with TB 
positive and negative sera 

Titration of Mo-2 with TB positive and negative sera 

Titration of Mo-4 with TB positive and negative sera 

Titration of Mo-28 with TB positive and negative sera 

x-axis: ng recombinant/well 

Titration of Mo-29 with TB 
positive and negative sera 
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SEQUENCE LISTING 



(1) GENERAL INFORMATION: 

APPLICANTS: Reed, Steven G. 
Skeiky, Yasir A.W. 
Dillon, Davin C. 
Campos-Neto, Antonio 
Houghton, Raymond 
Vedvick, Thomas S. 
Twardzik, Daniel R. 
Lodes, Michael J. 
Hendrickson, Ronald 



NUMBER 



COMPOUNDS AND METHODS ? r. B 
~U3ERCULCSIS 



'CUENCZS : 



:rss^ 



^or.f S E = 



6100 Co iambi 



and BERRY LLP 



i e a r_ : 
IV a s r 



a Zenker, 70: Fifch Avenue 



:rv: USA 
33::4 - - ■ 



MEDIUM TV 3 * 



LPM 

Loppy diak 

DCS /MS- 



A ap pl ; cat: cm number 



Mar;i 



- NEC P MA"" * 

"ill 4, 

3 : - ^ 3 ; 
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(DJ TOPGLOGV: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

CGAGGCACCC GTAGTTT3AA CGAAACGCAC AATC3ACGCC CAAACGAACG GAAGAACVGA 
ACCATGAAGA TGGTGAAATC GATCGCCGCA GGTCTGACCG CCGCCGCTGC AATCGGCGCC 
GCTGCGGCCG GTCTCACTTC GATCATGGCT GGCGGCCCGG TC3TATACCA GATGCAGCCG 
3TCGTCTTCG GCGCGCCACT GCCGTTGGAC CCGGCATCCG CCCCTCACCT CCCGACCGCC 
3CCCAGTT3A CCAGCCT3C~ CAACAGCCTC GCC3ATCCCA ACGTGTCGTT TGCGAACAAG 

CATCGGGGGC ACCGAGGCGC GCATCGCCGA CCACAAGCTG 




JGGTGACGAA ^atccagccg 



5C 
120 
18J 
24C 

1 0 0 
360 

t a-: 

:aatgaa ggcggcttgga tggtgtcagg GGGAGGGGGG 540 

TTGA 5 0 C 

GGGGGGTGGA GCAGGGTGCG 6 6 0 

GG^CGGAGGG AGGGGGGGG7 GGAAGGGGTG G^G GAGA TAG 3TGG7G:;G7G ^ 

--aggagng axgaggggg:; :;::tgg:cc:^t tg~g3:^g— g:;atga 
: :::g:r:-!at:c:; ?g? ..^ec :g :;g : : ■ 

: JGCGENGG '"iARAGTir.lGTr^G . 

A G £:«'"-: ~00 :as- -a.:;; 
'G TYPE : r :ii:^:; ac;?. 

G , G TT^AIO EX :;e S 3 .~ : r. - ^ e 
- — T G PC GY . ; r.ea r 

:g;:;g^;g^ gesg?:?— ^? ; - 



GTGAGGGAGA AG3TGAGG77 ( 
A7GGAG77GG ZGZAGGCZGG AGGGNAAG7G ATT GG CGGG G GGGNTTCAGG GG 
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GC~~ GCGC7 G3CCGGGATG TCGATCGGGG CGGTCCTCCG ACCTGCTACG ACCSGAT7TT 

—TGATGTC CACCATCTCC AAGATTCGAT TCTTGGGAGG CTTGAGGG7C NGGGTOACCC 

GCCCGC3GGC CTCATTCNGG GCTOTC3GCN GGTTTCACCC CNTACCNACT GCO.CCCGGN 

TTGCNAATTC NTTCTTCNCT GCCCNNAAAG GGACCNTTAN CTTC5CCGCTN GAAANGGTNA 

NT C CTNGAAN CCCCNTCCCC C^ 

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 313 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

CATATG SATC ACCATIACCA 7CACACTTCT AACCGCCCAG C~-~z—r— 

-wvLGCGACA CC3GGCCCGA 7CGA~*~~ v-^™-,^- 

..^.^AG.C .GG7CAGGCA TCGTCOTCAG 

A.v.i 4 . -^A_. GAGATATGGC GGCAA7CCAA TCTCCCG CC~ 
jCGGCCGGCG gtgctgcaaa STAC ~- 



j AT TAAATGTCI 

-^-1 rrt.M : . ;T7I _"~ ] 



w vj.'vrvTT GAATATSAC" ~'" ; G""AC > 



* ^ .j * » o ^ .-vAGGTG^ 



5-10 

50C 

660 

720 

751 



130 
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TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

GGGTATGAAC ACGGCCSCGT GGGA-AAG— — r— 

— A^-G.C CAGGGTGGGC AGGGA7TCGC 

CATTCCGATC GGGCAflGCGA TGGCGATCGC GOCCCAGATC CGATCGGGTG OOOGCTCACC 
CACCGTTCM ATCGGOCCTA CCGC^CT CGGCTTGGGT GWGTCGACA ACAACGCCAA 

OGGC GCACGA GTCCAACGCG TGGTOGGGAG CGCTCCGCCG GCAAGTCTCG GGATCTCCAC 

7GGCGACGTG ATCAGGGGG^ — 

— _«m^ jv _ TGCGATCAAC tcggccaccg ogatggcgga 

igcgcttaag gggcatca"^ ^~GG~~Ar~~ 

^ —-- — -3 aactggcaaa ogaagtcggg 

-GCACGCG7 ^GOGAACG TGACATTGGO OGAGGGACCC CCSGCCTGAT 7— 



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 504 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

- -* ^ j'«_:TEG O YGAG ""aTGT^G"'^" " - - ~ ^ 



-v^^ ..„-^ w GG0G 0GAGT3AGGA OGOGGO: 

:gg ::gagcggggg aa-^g~" — 




:;ngy:;a;; og:;gtyy:;a:; — nnn;;tyy togncganat "ananag:;; 



-;:::o; a^ngng-;:; yynaanaany ^tnanngnn. 



60 
120 
180 
240 

300 
360 



'GTGGYGG 4 ;o 

lAOOGG YGGGGGGGGG 



44' 



TGATGNGA 4 3 

n N AG iJTNGNT 54; 
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iA. LENGTH : 63 3 base Da its 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 
TTGCANGTCG AACCACCTCA CTAAAGGGAA CAAAAGCTNG AGCTCCACCC CGGTGGCGCC 
CGCTCTAGAA CTAGTGKATM YYYCKGGCTG CAGSAATYCG GYACGAGCAT TAGGACAGTC 
TAACGGTCCT OTTACGGTGA TCGAATGACC GACGACATCC TGCTGATCGA CACCGACGAA 

--3GGTGCGAA CCCTCAC^"-*"" "inr^'^r-. ^, 

AAL___ ^GTCCCGYA ACCCGCTCTC 3CCGGCGCTA 

3GGGATCGGT TTTTC3CGGY GTTGGYCGAC GCCGAGGYCG ACG a CG ACAT CCACGTCGTC 

^TG _ 3CC3GACTGG ACCTCAAGGT AG CTCGCCGG 

;~.-vortCC3C3 GTGCCCGACA ""CT OA C 

CGCGG TCACCGGCGG GC^CGAA — 

i "-^^ L -*'^ o^o-.. . AGI GCGACATCCT 

:gGCt ^3GGC tgctgcccac 

JGNCTGGGCC GGTGGATGAG 



CGCGATCAAC GC 



^hu^o^c: Gu_concga gacccacg 
ctggggactc agtgtgtgct tgccgcaaaa - — — - — ■ 



jACC3g:: -agtaggtgt CCGTGACGGA G 

INFORMATION FOR SEQ ID ; JC - - 

-. GEC'IENGZ '-'HARACTERISTI GG 
A' LENGTH: 13 62 oase ^ 



: TP AND ED NESS 



:polgc 



linear 



60 
120 
130 
24 C 



-50 0 



18 . 



-jMi.. i 'j^^ . j.-. ~ ~» ^. _ GGCTGG 
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6 



-w^^>.ju w ^ GGT72GGGCC'"' GCJ - — ^ 

^.o..^, .C.^ATACC TGGGCACGGC GGTGCAATTC S 40 



^ ArCG CACGCCTGG7 CC7GG73CT3 3TGGAC3AAA CC77CCTCCC GGCGGGCGCC 
"GGGCCAAC AGCTCVTGCG GCGC3CG3GT 3GACTGC7G7 7CGCCCGCAA GGTGC3CGGG 
3AGCA7CGGC CQWCCQCTC CACCCSCGGG C7CGAGCCGC GAACGCTGCC CGACGATC7G 
GCA73GGCAA CACCG7CCGA GCCCA7AGCA ACCGCGTTCG CCGCGCTCAG CCACCACC7G 
GACACC3C3C CGCACCTTGCC CCCACCGAC7 CG7CAGG7GG 7CAGGCGGG7 CGTGGGGTC3 
T3GCAC3GC3 AGCGAATGCG GA7GAGCAG7 CGCTGGACGA ACGAGCACAC CGCGGAGCTG 
2GG3GG3AGG 7GGACGCGGG GACGGG7G7 
G7GACGGACG ACGACG7G 



500 
560 
720 
780 
840 
900 

ICCTGGC CGGGCA7GAG 96 0 



~° ^ -GGCGGA rGGGTGCTCG ACACCGAT 
-^^~>_G73G GG7GGGCGGC JTT""'"^'^^' 



3-C3CCGAGG GCCAGGTGTC GC3GCAAAAC CCGAC7GGG7 GAGTG7GC3C GCCCTG7C3C 
:GCTGGCCC GAGGGA7CTC GCGGC3GCGA AC 3GAGG7GG CGACACAGG" 



■^j^GvtCGCCG — ^, , r ^ 

JrtrtGG.-^rtu GAACGTGG GGTGACGG; 



^formation ?cr se: — ::: . , . 

- iEQ'JENCE G!LAi?AG7-p:3T: 
TOPOLOGY. l.::ear 



102^ 



^ J ^ L "--° ^^ 0 .r^.,._^ XQ8C 



1140 

120C 

- ^- - — A G7 GG 0~* 7 3 G G G G G \A ^ ^ rt " , „ ^ ^ 

^u.o.iCGGT 7GGGGGGAG7 1260 



I32C 
136: 



WO 99/42118 



CGA773AGGA T7C3G7GCAA 7C3A7C7TTG J3ACGCTGGG ACAGGCCGCC GAGCTGCAGG 
3GGG7GGAGG GGGCACCGGA TATGCGTTCA 3GGACG7GCG ACCCGGGGGG 3A7GGGGTGG 
GGTCCACGGG C3GCAC3GCC AGCGGACG3G 7GTCGTTTCT ACGGCTGTAT GACAGTGCCG 
CGGGTGTGG7 CTCCATGGGC GG7CGCGGGC GTGGCGCCTG TATGGCTGTG CTTGATGTG7 
CGCACCCGGA 7A7C7GTGAT TTCGTCACC3 CCAAGGCCGA ATCCCCCAGC GAGCTCCCGC 
ATTT CAAC CT ATCGGTTGGT GTGACGGAC3 GGTTCCTGCG GGCC3TCCAA CGCAACGGCC 
7ACACCGGC7 3G7GAA7CCG C GAAC C GG CA AGA7GGTCGC GCGGATGCCG GCCGCCGAGC 
7G77G3ACGC CA7G7GCAAA GCCGCGCAC3 GCSGT3GCGA TCCCGGGr— -TGTTTGTCG 
AGACGATCAA 7AGGGGAAAC JGGG7GGC3G 3GAGAG3CC3 3ATG3AGGC3 
2C3GCGAGG7 GCCAC7GC-G CCTTACGACT 3A7G7AA7C7 CGGC7C3A7G 



— CG.-.CG3 . GG3 G i CGACT3GG ACCGGCTCGA GGAGG7GGGG GG7G7GGCGG 




"-"j JGG77CC7 7GA7GACG7C A7CGA7G7CA ; 

-GGCCCGCGC GACCC3CAAG A7CGGGG7G3 JAC7CATGGG T77GGCGGAA 

JAJ73GG7A7 7GCG7ACGAC AG7GAAGAA3 3GG73CGGT7 AGGGACCCGG J73A7GC3: 

r;.v:'ACAGC-, :GC3GCGCAC ACGGCA7—J JGAGGC7GGC CGAAGAGGG3 3JGGC-.TT: 

■ . , _ . JGJCGCGAG GGGCAA^GGA .'AGG73AG: 



- GT: : . - >;2 j a s '* r a ; ;- ;-. 

- ------- ? 

-.\jnn^._„. . . , , ; _ A~27ACC0AI 



54C 

6QQ 

720 

94 0 

?Q 0 



: : s : 
i : 4 ; 
12 oc 
12 5 C 
132 : 

:. : 3 : 






PCT/US99/03265 



"3CGGTGCA GCZGC^CCG GTCCTCAAGG AA GG GG A CGA TTGCCCCGAT TCGACGCTGG 
C C 3 T C AAA GG TTTGACCAAC GCGCCGCAGT ACTACGTCGG CGACCAGCCG AAGTTCACCA 

TGGTGGTCAC CAACATCGGC ~~GGTG— 

^ — ^-^A CGTTGGGGGG GCGGTGTTGG 

GGGGCTACGT TTACTCGCTG G ACAACAAG ^ — — ^ ~. 

jrtUiALAAGv. ^.iGxoGiC CAACGTGGAC TGCGCGCGCT 

CGAATGAG.\C GCTGG^CA^d ar^^-r^- 

■^-vJUU, ACs^TTTCC. GCGGTGAGCA GGTAACGA GG GCGGTGACG7 

GGAGGGGGAT GGGATCGGCG CCGCCCTGCC CA7TGCCGC~ GG— r- a — 

— - ^C^^GGA.C GGGCCGGGCA 660 

GCTACAATCT GGTGGTACAA GTGGGCAATC TGCGCT^ 



-oCTCGCT GGGGGTTCGG TTCATCCTG; 
l " rt " WjCC 3C ~ C ™~ GGG GGGGTAC 3CGCTCCGGG TCCAGCGCAG GCGCCTCCGC 



-j^GCAAGGC J G A T AA TT A T '"G;-^- — ■ t-~^-h~~„™„ 



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 622 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 



^v. _r-iA ^ ^jACA^A 



3 SO 
420 
480 
540 
60C 



780 



362 



-AGGAT 1 4 o 






(A) LENGTH: 1230 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

GGCGCAGCCG TAAGCCTGTT GGCCGCCGGC ACACTGGTGT TGACAGCATG CGGCGG7GGC 

AC GAA GAG CT GGTGGTCAGG CGCAGGCGGA ACGTCTGGGT CGGTGCACTG CGGCGGCAAG 

AAGGAGOTCC ACTCCAGCGG CTCGACCGCA CAAGAAAATG GCATGGAGCA GTTCGTCTAT 

TGCTACGTGC GATCGTGCTG GGG CTACACG TTGGACTACA ACGCCAACGG JTCCGGTGCC 



GAGGGGTGC3 "TTCGCGGGC 



- v_.jrtCGGTGT TCGGCGCGAT ^GGGA^^C^ T\ - * ^ ^ , „ -,^„^„, T ,^, _ ^ ^ _ _ 

~ ^^-^ -*\G\jo^GTGAG CACGCT 

-.TGA2GGAG CCACTAGCGG C AAGATTT7 " 



— vj ^. f. w. 



- ~^^-^AC C * GGGGGCAA GACGGATTAG 
J C GAGA AGTCCGG7AC GTGGGACAAC TTCCAGAAA? ACGTGGACGG 



-jo^ -i CAAAGGC Z 



j AAA C 3 



w-JO'^ ^.rt GC 



£0 
120 

iac 



GGGGTGACCC AGTTTCTCAA CAACGAAACC GAT—CGCCG GCTGGGATGT :CCGTTGAAT 3 00 



600 






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
GCAAGCAGCT GCAGCTCGTG CTGTTC3AC3 AACTGGGCA7 3CCGAAGACC AAACGCACCA 
AGACC3GCTA CACCACGGAT GCCGACGCGC TGCAGTCGTT GTTCGACAAG ACCGGGCATC 
CGTTTCTGCA ACATCTGCTC GCCCACCGCG ACGTCACCCG GCTCAAGGTC ACCGTCGACG 
GCTTGCTCCA AGCGGTGGCG GCCGACGGCC CCATCGACAC CACGTTCAAC CAGACGATCG 
CCGCGACCGG CCGGCTCTCC TCGACCGAAC CCAACCTGCA GAACATCCCG ATCCGCACCG 
ACCCGGGGCG GCGGATCCGG GACGCGTTCG TGCTC3GGGA CGG7TACGCC GAGTTGATGA 
-GCGCGACTA CAGCCAGATC GAGATGCGGA 7CA7GGGGCA CCTGTCGGGG GACGAGGGCC 

ogatggaggg ottgaagago ggggaggagg ggtattggt^ gg^g^^gt^^ — 

GTGTGCCCAT C3ACGAGG7C ACC3CCGAG7 TGCGGCGCGG 3GTCAAGGC3 ATGTCCTACG 
GGC7GG7TTA CGGGTTGAGC GCCTACGGCC TJTC3CAGCA GTTGAAAATC 7CCACCGAGG 

AAGOGAAGGA G GAG AT GG AG 

^ — — — ^aTTCGG GGGGG7GCGC GACTACGTGG 

ggg::gtagt ggagcgggcg gggaaggagg ggtagaggtc gagggtgctg gggggtgggg 

GG7AGGTGGG GGAGCTGCAG AGGAGGAAGG GTGAAGTGGG GGAGGGGGCG GAGCGGGGGG 
^_.A7GGAG ZZC^GZ-Z^ " G G A C A T G A GAAGGTGGGG ATGATGGAGG 




^-G.GG 7GG.~GAGG_ OTGGGGGGGG GAGGGGGGGA 



;ggggaa gtgggggat~ 



ogcgengg gh.^rajt^=:gt:gg 
a length : : 3a . . 

07? A*JG FGN'rG ■ • 



so 

120 

iao 

240 

300 
360 
420 
•; 3 C 

54 0 

60 0 

660 

"20 

"30 

^4 ^ 

-0 : 






GGTZGGGGAZ CGTCAAACAG TTGG7GCTCA ACGACGGGGG ATTGGTGCGC 



ATGGAAGACA GGGAGGGACG CGGCCAGCCC GGTGGAACGT CGATTTACC 



JTGCTGCGC 130 



240 
300 
360 
4 2 0 

480 
54 0 



GGCCGTCGGA TGCGGATTCG GCAGCTTCCC GGTGGGACGG CTGCGGGTG3 GAGCAGGGAC 
ATCSAGAACT CTCGGC-GTTC GGCGAACSTT ATCTCAGTGG AATCTCAGTC CACGCGCGCA 
ACC1AG1TGT GCAG ITACTG TTGAAAGCCA CACCCATGCC AGTCCACGCA TGGCCAAGTT 
GGGGG GAGTA GTGGGGGTAG TACAGGAAGA GCAAC GTAGC GACATGACGA AT CACGCACG 
GTATTGGCCA ZZGZZGCAGC AGCCGGGAAC GGGAGGTTAT GCTCAGGGGC AGCAGCAAAC 
GTAGAGCGAG CAGTTCGACT GGCGTTACCG ACCGTCGCCG CCGCCGCAGC GAACG GAGTA 

.jAGGATGACG zz::zz~zz7g GGATGGTTCG GGAAGGGGGT ggtggaggga tgttggcgat 

23GGGGGGTG ACGATAGGGG TGGTGTCCGG GGGGATGGGG GGCGGGGCGG GATGGGTGGT 

GGGGT7GAAG CGGGZACZZG ZZGGCZZZAG GGGGGGGGCA GTGGCTGCGA GCGCGGCGCC 

AAGCATCGCG G CA G C AAA CA TGCCGCCGGG GTCGGTCGAA GAGGTGGCGG GCAAGGTGGT 

AAAGCGATCT GGGGGGGGAG TCGGAGGAGG GGTGG jG GAT 

:gt gcggaggggg tgatgttga- CAAGAAGGAG GTGATGGOGG GGGCGGG2AA 96.: 

:tg gggagtgggg :gcggaaaac gagggtaagg gtctgtgagg ggcggacggc 

;^g gtga:gggga tztzzzzggz — gaggtgagg 



^3C 
34 0 

90 0 



:2o: 






CTGATGAAGG TCGCCGCGCA GTG TTCAAAG C 
(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1058 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

-TCCACCGCG GTCGCGGCCG GTCTAGAACT AGTGGATCCC CCGGGCTGCA GGA 

ACGAGGATGG GACGTCGCAG GTTGTCGAAC CCGCCGCCGC GGAAGTATCG GTCGATGCC: 



GAATGA7GG AAAACGGGCG OTGACC 



ATCAUGATGT TC . . CGGGGA AACG7GATGG GGAGGAACAG GGTG1 



ZACTC 



-rtAAAGGGTT GACGAGCGTG GAGGTAGC 



* • — JubL ..A/ 



-i .nj\G vjG GGTA' 



^j^^^o^,^^ --^^-v.. --wG A GAGGA GGCGGGCAAT 77GGGGGGGC ' sr- 

-.jij^ort^ju GGAGCGGCGG . AA 7 G GC^GGA ^^jp^^^^ 

^^OunGo^ vjoGCAGTCAT GC~^AGGG^ 

-^^..TTCC GCCTGCGGGC C0AT77GAGA ATCGAGGTAG TG AG CGG AAA 3Q0 



48C 



: o: 






(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 
GAATTCGGCA CGAGAGG7GA TCGACATCAT CGGG AC CAG C CCCACATCCT GGGAACAGGC 
GGCGGCGGAG GCGGTCCAGC GGGCGCGGGA TAGCGTCGAT GACATCCGCG TCGCTCGGGT 
CATTGAGCAG GACATGGCCG TGGACAGCGC CGGCAAGATC ACCTACCGCA TCAAGCTCGA 
AGTGTCGTTC AAGATGAGGC CGGCGCAACC 3CGCTAGCAC GGGCCGGCGA GCAAGACGCA 
AAATCGCACG GTTTGCGGTT GATTCGTGCG ATTTTGTGTC TGCTCGCCGA GGCCTACCAG 
3CGCGGCCCA GGTCCGC3TG CTGCCG7ATC CAGGCGTGCA TCGCGATTCC GGCGGCCACG 
CEGCAGTTAA TGCTTGGCGT EGACCCGAAC 73GGCGATCC GCEGGNGAGC 7GATCGATGA 
ICETGGCCAC 2CC3~CGA7G YGCGAGT7CC 2CGAGGAAAC GTGC7ECGAG JCCGGTAGGA 
AGC37CCC7A 3GCGGCGG7G C7GACCGGC7 27CCC7 GCEC CC7CAG7GCG GCCAGCGAGC 



(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 913 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

. . „-JE'JGCG70 J CC GATE AG 2 TGCGCATGGr 

rt .GACE 3CCT77GCC3 ECGGCACEGE 2GGTGGCGCC GGGGCCGCCG ATGCCACOC: 



60 
120 
130 
240 

300 
360 
42 0 
4GC 



54 2 






.-^-CCCC 3GACCCACGG 3TCCC3CCGA TCCCCCCGTT GCC3CCGGTG CCGCCGCCA: 
TGGT3C7CC7 GAAGCCGTTA 3C3CC3GTTC C3C5GG7TCC GGCGGTOGCG CCNTGCCCGC 



CGGCCCC3CC CTTGCCGTAC AGCCACCCCC CGGTGGCGCC GTTGCCGCCA TTGCCGCCAT 
T3CCGCCGTT GCCGCCATTG CCGCCGTTCC CGCCGCCACC GCCGGNTT GG CCGCCGGCGC 



-^^juv»3GC CGC 

INFORMATION FOR SEQ ID NO : : ^ ■ 

i SEQUENCE CHARACTERISTICS: 

,A, LENGTH: 1372 oase aairs 
■:3) TYPE: nucleic acid 
■-: J TRANCED NESS : single 
. F ; TO PC LC G V : linear 

SEQUENCE DESCRIPTION : 3 EQ IF 



/JO : : - 



«^-.ulCGC CC3GACCCT. AAGGCTGGGA CAATTTCTGA 



IACCCC 3ACA3AGGAG GTTACGGGAT GAGCAATTG3 3G0CGCCGC 
. jvjTTj 3T3AGC3T33 TGGCTGCCGT ~" 



CTCAGG TG 



■* ,J ^- - -GGCCACGG 



:CC3 CCGGCCTTGT CGCAGGACCG 



rw^^.j^.. AAGTGGCC 



- ^ - - ^ 



kC ATCAACA 3 CAAACTG 33 
^TCCFAACG JTCTCGTG CT 



- • ^jw 1 an . 



--.jCGTCC3 ATAAC7 
^~30GATGG 3GAT''"1 



"8C 
840 
90 C 
913 



I 30 
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catcaactcg gccacggcga t^cggacgc gcttaacsgg catcatccgg g-tcacgtcat 

"TCGGTGAAC 7GGCAAACCA AGTCGGGCGG CACGCGTACA GGGAACGTGA CATTGGCC3A 
GGGACCCCCG GCCTGATTTG TCGCGCATAC CACCCGCCGG CCGGCCAATT GGATTGGCGC 
GAG CCGTGAT TGCCGCGTGA GCCCCCGAGT TCCGTCTCCC GTGCGCGTGC CATTGTGGAA 
CCAATGAACC AGCCAGAACA CAGCGTTGAG CACCCTCCCG TGCAGGGCAG TTACGTCGAA 
GGCGGTGTGG TCGAGCATCC GGATGCGAAG GACTTCGGCA CCGCCGCCGC CCTCCCCGCC 



GATCCGACCT GGTTTAAGCA CGCCGTCTTC TACGAGGTGC TGGTCCGGGG 
GCGAGCGCGG ACGGTTCCGN CGATCTGCGT 3GACTCATCG ATCGGCTGGA ~ACCTGCAG 



:oac 
1:40 

1260 
1320 

:3ao 

■C 1440 



— -ACT^T CTGTTGCCGC C3TTCCTAC3 



tcaccggt gcgcgacggc 



33TTACGACA 7TCGCGACTT CTACAAGGTG CTGCGCGAA7 TCGGCACCG7 C3ACGATT7C 
37CGCCC7GG TCGACACCGC TCACCGGCGA GGTATCGGCA TCA7CACCGA GCTGG7GA7G 
AATCACACCT CGGAGTCGCA CCCCWGTTT CAGGAGTCCC GCCCCGACGG AGAC3GACCG 
7AC3GTGACT A7TACG7G7G GAGCGACACG AGCGAGCGCT ACACCGACGC C COG AT CATC 
7TCC-TCGACA GCGAAGAGTC GAAP— — • 

. - tccgccgaca 

gcac cgattc tt 

: ::;forvat:o:i -or gec ;g -c.is 

: G E C r JH?;c G 7:iA£ACT E R I C T I C C 

A -^NGTK: 143: Das-TL.r ■ 
5 ■ * r'?E : nutjle : c ac;: 
C ; STRAND EDME S S : 

TCPCGCGY _nea r 

JEC'JENCG C E G CG I G 7 I ~ \* 



150C 

15 3: 

13 0C 
136C 

13^: 



GGGAA C AA C " - r G r z AAAG TCG A C A G ' 



-GGTA'. GACC AC7GCC3A7: 






GCGCCAAGAG TGGAAGGGCG GCGACCGTGT GGA~- GG~A — ar-v*™~ 

^..bC.^A ^ACGGCTCG CACCACCTCG 

™ G CATC3AC — GGATTCAGCT CACGCAGTCG AAATGGAACG 

AACCCGTCAA CGTCGACTAG GCCGAAGTTG CGTCGACGCG TTGCTCGAAA CGCCCTTGTG 
AACGGTGTCA ACGCCACCCG AAAACTGACC CCCTGACGGC ATCTG AAA AT TGACCCCCTA 
GACC3GGCGG TTGGTGGTTA rrcrTCGGTG GTTCCGGCTG GTGGGACGCG GCCGAGGTCG 
CGGTCTTTGA GCCGGTACCT GTCGCCTTTG AGCGCGACGA CTTCAGCATG CTGGACGAGG 
GGGTCGA7CA TGGCGCCACC AAC3ACGTCG TCCCCCCCGA AAACCCGCG CCACGGGCCG 900 

AAGGCCTTAT 7GGAC37GAC GATTiAr—— 

'"" Vn -' u ^CCGCTGAT ACCGGGAGGA CACCAGCTGn 

AA G AAG AG G 7 CGGCGGCCTC 3GGC"~A 



-G CGAATGTAAC GGACTTGGTC AACGACCAGG 
AGCGGATACC GCCCAAACCG GGGGAGTTCG GCCTAGATGC GGCGGGCGTG G 
^ACCGTG CTACCCATTC GGCCGCGGTG G 



:caag gacttgggaa cggggaag: 



ATGGGG 2C~"T CACCACCATG GGACTCCC JG GCTGACAC 



^formation fc? - ; } 
: GEQ T JENC3 gharactk»;3t:cg. 



d4C 
60C 
660 
720 
730 
340 



108C 

'cgatgac2 gg gctgacac 1140 

;ggggtatgg ggaggccgac ggcaagatga gt — — 

— — -ov— aggcgg ggcggaaaaa 

— ^^.^ tggcgggggg tgatgaaatg gagggtgggc agatgtgcga 

TTCAGCCCA CGAGCA7GC7 GAAAG— — — 



12 0 0 

12 6 0 
132: 



CCCGCTGGAG 13 9C 

--.^.o ^ . C7GGGCG CGGGCCGGAT CGGGGAG G" 



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3 G AGAA CTTC GATCCCGAGG GCGTCCTGGG GGGTATCTAC C3NTATCACG CGGCCACCGA 
SCAACSCACr AACAAGGNGC AGATCCTCCC 3TCC3GGGTA GCGATGCCCC CGGCGCTGCG 
3GCAGCACAG ATGCTGGCC3 CGGAGTGGCA TGTCGCC3CC GACGTGTGGT CGGTGACCAG 
TTGGGGC3AG CTAAACCGCG ACGGGGTGGT CATCGACACC GAGAAGCTCC GCCACCCCGA 
TCGGCCG3CG GGCGTGCGCT ACGTGACGAG ACCGCTGGAG AATGCTCGGG GCCCGGTGAT 
CGCGGTCTCG GACTGGATGG GGGCGGTCCG CGAGCAGATC CGACCGTGGG TGCCGGGCAC 
ATACCTCAC3 TTGGGCACCG AC3GGTTCGG TTTTTCCGAC ACTCGGCCCG GCGGTCGTCG 
TTACTTCAAC ACGGACGCCG AATCCGAGGT 7GGTCGCGCT TTTGGGAGGG GTTGGCK5GG 
CCGACGGGTG AATATC3ACC rAT^""",r;T^." ^-^^^.^^.^^ 



360 
420 
48C 
540 

600 

660 
720 
78 0 

AvJl:ag::ggg d4j0 



^ ~ - - ^rtw ^r/-w *. o^TGGGGGGT ~" G C C C G C S A ^ ^AAG""""" 

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1021 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

.^.-^ . TCCCCACCA jAGACAAAAT VECAGOCSTT AATCCAGGA; 

- - " ~ ' --"^--^j^ -^Akj^^rw I2GA ' j G AA C 2 AAA 1 . 
rt^.j^ . o^.. .^.^^ .-^. v . ^ACCGCGAC: "TCGT'JTCGA AATTCCCCCC 






^^TTGGCC CGACCGCCGT GCCCGCACTG CTGGTCAGGT ATCOGGGGGT rTTGGC3.AGC 34C 
AACAACGTCG GCAGGACGGG TCGAGCCCGC -GGATCCGCA GACGGOOGOG GCGAAAACGA 90 C 
EATCAACACC GCACGGGATC GATCTGCGGA GGGGGGTGCG GGAATACCGA ACCGGTGTAG 960 



^CSCCIGC AGTTG7TTTT CCACCAOCCA AGCGTTTTCG GGTCATCGGN GGCNNTTAAG 



(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 321 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

Aja\_GGAAGAA CACAACCATG AAG^TG^'"" * 



iw^^v.^j^^^ CTGCAACCGG 
I-3GTCGTAT A C GAG A TG CA 
- - -> - — ■ 7 ANGTCCOGAC 



^^o^GCG GCCGCTGTGA 2 



" vrtvJ -3GACCAGi;C TGCTCAACAG ::CTGGi;CGAT 



— GAAGGGNAGT ETGCTCGAGG GNGGNATGGG \'GG?J, 



1020 
1021 








CTTACCATCG CCG 

(2) INFORMATION FOR SEQ ID NO: 23:



3^3 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 352 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:
3TGAC3CC0T 3ATCCCATTC CTCGGC0GGO 2CGGTCCGCT GCCGGTGGTG GATCACCAAC 
TGGTTACCCG GGTGCCGCAA GGCTGGT^T TTGCTCAGGC AGCGGCTGTG CCGGTGGTGT 

- ^_CGAT^ TAGC CG AGA7 wWUJUCGGGC GAATCGGTGC 
""GATCCATGC 2GGTACCCGC JGTGTGGGCA "GGCGGCTGT GCAGCTGGCT GGGGAGTGGG 
GCGTGGAGGT TTTCGTCACC GCCAGCCCTG G M AA G T GGGA CACGCTGCGC GCCATNGNGT 
"TGACGACGA NCCATATCGG NGATTCCCNC ACATNC3AAG TTCCGANGGA GA 
(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: ^26 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:



60 
12C 
130 
- 4 J 
300 
352 






GGCGACAGCG CCTCCACCAT CGACATCGAC AAGGTTGTTA CCCGCACACC IGTTCGCCGG 
ATCGTG 

(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 580 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

TGC3ACGACG ACGAACGTGG GGCGGACGAC GGCCTATGCG TTGATGCAGG SGACCOCGAT 

■GTCGCCGAC IATAT G GAAG 2ATCCTGGGT 3CCCACTGAG GGACGTTTTG ACCAGCCGGG 




" T r ' AAl ' r "~ i - " CGCCGGGGCT TGTGCAGGTG ATGAAGGGGA 13c 
A 7 A 3 G G AA C A ATAGGGGGGT GATTTGGCAG TTCAATGTCG GGTATGGGT5 GAAA— CAA~ -4 - 

GGGGGGGGAT 3CTCGGCGCC GACCAGGCTC GGGGAGGGGG GCCAGCCGGA ATGTGGAGGG 3 0C 

AGGAGTGAAT GGCGGCGATG AAGCGCGGGA GGGGCGAGGG TGGTTTGGAA GGAACTAAGG 
ACGGGCGCGG CATTGTGAT3 CGAGTACCAC 77GAGGGTGG GGGTCGGGTG GTGGTGGAGG 
TGACACCCGA G GAAG GGGGG jCACTGGGTG AGGAAGTGAA AGGGGTTAGT AGCTAAGACC 

~ ::;f:rmat::n ?gr sec :; , -.> 0 
- sequence characteristics: 

.n .■ _! cMG TH : I f> C r a s e oa:r.; 
2 TV?E - , r: ~ 

- -TRAN^EENESG ; : — - 



360 

•13: 






(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

GACACCGATA CGATGGTGAT GTACGCCA^C GTTGTCGACA CGCTCGAGGC GTTCACGATC 

CAGCGGACAC CCGACGGCGT GACCATCGGC GATGCGGCCC CGTTCGCGGA GGCGGCTGCC 

AAGGC3ATGG GAATCGACAA GCTGCGGGTA ATTCATACCG GAATGGACCC CGTCGTCGCT 

GAAC3CGAAC AG T GGG A CG A CGGCAACAAO v~~~^'-— ^™ 

— - . GGCGCGCGG TGTCGTTGTC 

3CCTACGAGC GCAACGTACA GACCAACGCC OG 
.-•ECRKATT rCv. ac.^; . J :jc ; Z c 



(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31" base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



SECL^CE DESCRIPTION: GEC IE ^0:23: 
l-j >.j ^CTCGGAC T/YTGTGC 



— " - -o^-GAGCTC-r 



■J ~ ^ ^ 



60 

120 
130 
240 



:. 3 o 



(2) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 308 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:

3ATGGCGAAG TTTGGTGAGC AGGTGGTC3A CGCGAAAGTC TGGGCGCCTG CGAAGCGGGT 6 0 

rGGCGTTCAC GAGGCGAAGA CAC3CCTGTC CGAGCTGCTG CGGCTCGTCT ACGGCGGGCA 12 C 

^, (7^"~ T * /r !AviA 'y^^ C 0 G G C G G CGGCGAGCCG GTAGCAAAGC TTGTGCCGCT GCATCCTCAT 13C 

-j.-^wr-^ — Cjj- j^i. . AGGCA. - G^ ICATGGC GTGTAGC3CG TGCCCGACGA TTTGGACGCT 140 



(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 16" base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:



. jvj ^ „ O'J - 



\~ -J • JrtVJJUU 




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G G GAG AG GAT GGGCGCGGTG GACTGGTTCG AAGTACAGTG AATTGGAGGG GACGTGGTGG ISC 

ACGGAGCGGT CGCCCACTTC CAGCTGAGTA TGAAAGTCGC GTTGGGGTGG AGGATTCGTG 2 4 0 

AAGGTTGAAC GGGGGGCGAT AACTGAGGTG CATCATTAAG GGACTTTTCG AGAACATCGT 3 CO 

GAZGCGC1C3 AAACGCGGTT CAGCCGAGGG TGGCTCCGCC GAGGCGCTGG CTCCAAAATC 36 0 

GGTGGGAGAA TTCGTGGGCG GCGCGTAGAA GGAAGTCGGT GGTGAATTGG TCGGGTATGT 42 C 

GGTGGAG-TG TGT-GGGCTGG AGCGGGAGGA AGGGGTGCTC GACGTCGGGT GCGGCTCGGG 4 8;" 



,3uA. j rTG — —G 1 ' — GA CCju _ » A T'_ T '^iAAGAGCGALj LiGACGGTA'_G C GGG(J'I V I' _ oA 
VTGTGGGAG AAAGCGATCG GGTGGTGTGA : GGAGGACATG ACGTCGGGGC A C G C G AA CG Y 



jHL."i ^V-^,rt — ~\J-S. ._ j. ^ JrtKAvJ'J Jrt .-\J~\ 



---a ^ AT .wr. . ... jTCG - GGAG ^jGGG "GG'TTTG 



GAAGGGGGGG GGAGGATGGG TGTGGACGTA GTTCTTGGTG AATGAGGAGT CTGTTAGCGGA S4G 

'"AGGG~GGAA GGAAAGAGTG GGGAGAAGTG GGAGGAGGAG GGAGGGGGTT ATGGGACAAG 9C3 

J Z A Z .-u-*. '-j o ju^wv, jnnG AAGGAATIG3 G'G "G G C GG AG A G G G" T G G T G A GGG ATGG GTA 

^ ^ _ .-v>rvv_i ^ . w .jo. _ o ^ . ■ ^ o orv^A. .-v „ . jv_v-\ _ ^ rt'^ jvj -T^-v^uh j i ju .wJ'jun ^ ^ *. 

- ; — — o — — -"vo o — \ _ ~ -j - -.^ . -o^ on. o^^rtu — nwu. _.fv . . 
(D) TOPOLOGY: linear
x: SEQUENCE DESCRIPTION: GEO. 10 SO: 22: 

GTGGAGGGTG GCGTGGATGA GCG7CACCGC OGGCCACGCC GACC70ACC0 GGGCCCAGGT 6 0 

CCGGGTTGCT QCGG^QGCZT ACGAGACGGC GTATGGCCTG ACGGTGCGCO CGCCGGTGAT 12 0 

GGCCGAGAAC CGTGGTGAAC TGATGATTCT GATAGCGACC AACCTCTTGG GGCAAAACAC ISO 

OGCGGCGATG GCGGTCAAGG AGGCCGAATA OGGCGAGATG TGGGCCCAA G ACGCCGCOGC 24 Z 

GATGTTTGGC TACGCGGCGG CGACGGCGAC GGCGACGGCG ACGTTGCTGG J3TTCGAGGA 3 0 :, 

0 3C3CC0GAG ATGAGGAGGG GGGGTGGGGT OGTGGAGGAG GCGGGGGCGG TOGAGGAG 00 2 6 0 

3 0 T GAG 0 0 0 ACGGAGGGGA 0 OA G G 0 GOT 0 OI'OOAAGGTG 3GTGGGGTGT jGA^jAC j O -th.j 

GAG ZAACTCG GGTGTGTGGA TGACGAAGAC OTTOAGCTOG ATGTTGAAG 1 GOTTTGOTOG «0C 

J3GGGCGG0G GGGGAGGGGG TOOAAACror 1 "GOGCAAAAG GGGGTOOGGG ^.j AT GAG _ * 0 *5 6 „ 

3 G . . ^jG T . J ^ A> . .-V ^ G ..jT G r.J^;r\ . ~ aOA-Arlj^ TA ^ v_; CANf-Vo - _ - - ^- ^J'— ~ — - 



;hmat:on pop. sec :o 
oecuenoz tiahagteristioo . 

A GGNOTii : J \ ■> :as^ . i : 
E "VPE r.u::-: • »■:;: 

"" T3 \'.'0 EGI-'GG/ " : ■ 

^ . G G T „ GO ^O^-T^T "!!)*.' GO , 



. jGrv 0 OAG.-a 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1227 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:



GATCCTGACC GAAGCGGCC3 CCGCCAAGGC GAAGTCGCTG TTGGACCAGG AGGGACGGGA 5 0 

CGATCTGGCG CTGCGGATC3 CGGTTCAGCC GGGGGGGTGC GCTGGATTGC GCTATAACCT 12 0 

TTTCTTCSAJ GAC C _"_jACG ^GGATGGTGA GCAAACCGCG GAGTTC3GTG 3TGTGAGGTT 18 'J 

■j*-^ - — * - -'^rt'-. - ^jv^.-l ^ taaLj^. • ,i. .^jirti J. juAm-joC - G SAT jA IT _ CGTCGACAC . " 4 0 

_.-iT. i.'aj.iAu .A^^j^. - /-i rC. j Al_AA\ r CCAAG G C 3A , G G3GT GGGG 3GCGTG33GG 3 0 0 

GATT 7\; - TC.-a Ai_ ^ - A.AA.n _ GGTAGTAGG ACCCCGCGGT 3C 3 CAAGA 3G TAG GAG "AG A 3 6 0 

GGClGG^GGG Gvj^ — Asjvj^GTG ACGTGGATGG TGAAGAGGAG CTGGGC 7TGA TATTGCGACC 4 8 r 

AGTACACGAT TTTGTGGATG GAGGTCACTT 3GAGCTGGGA GAACTG STTG CGGAACGCGT 54 0 

GGGTGCT3AG CTTGG CCAAG 3 G 3TG AT CGG AGCGGTTGTC 3CGCAC3CC3 TCGTGGATAC 6C0 



-CTGAA.CGC GG ^ ^ „ GGCG GGTCCCTC33 AGAATGTGGG TGCCGTGTTG 3CTCGGTTGG ^Z': 



J'J . . „ JU 
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:

GCGGTGTCGG CGGATCCGGC GGGTGGTTGA ACGGCAACGG CGGGGCCGGC GGGGCCGGCG 60 

GGACCGGCGC TAACGGTGGT GCCGJCGGCA ACGCCTGGTT GTTCGGGGCC GGCGCGTCCG 1X0 

GCGGUGCCGG CACCAATGGT GGNGTCGGCG GGTCCGGCCG ATTTGTCTAC GGCAACGGCG 130 

: 3 : 

(2) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 190 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:
3GG3TGTGGG EGGATCCGGC 'GGGTGGTTGA ACGGCAACGG 33GTOT3GG3 GGCCGGGGC3 60 

L3GGCGGGTJ 3A3CGGCGGC AACGGOGGT" TTGGCGGOGG GGGCGGTGGC GGAGGCAAC 3 13C 
J/J J3A333 :GGCTTC3GT "AGCAACGGCG T-TAAGGGTGG 3GA3GGCGGN ATTGG333 3 3 2-\C. 



;:;F:?MA~:3r: -dp 3E3 :z :;c : z a ■ 

"~ EC TEN 03 CHARACTER. 1 3 T 1 3 3 
A LENGTH: : 4 Ldbe : i : 



"3 ANTED!* L^; . ■ : . 

■.3t33 3atggngggt 3t3agt3ga; 
:e:rvat:on ecr 3ec :d no:3. 



LENGTH. 13^ 
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ICOCACCGCT CCCACCGTTA OCGAACAAGC 



GATCGCTGCT CGTCCrCCCC TTGCGGCCGA 

T3GCGTGGTC GCGAGCACCG CCGGCAGCGC CGACGCCGGA GTCGAACAAT GGCACCGTCC 

TATCCCCACG ATTGCOGCCG GNCCCACCGG CACCG 

(2) INFORMATION FOR SEQ ID NO: 40:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 53 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



155 



SEQUENCE DESC?.:?'! , ^N: 



(2) INFORMATION FOR SEQ ID NO: 41:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 132 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:



3NGG jgcg: 



: 41 : 



-■^jNGvi "SGCGGGCGGG "!OO0GCGO! 
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

-.XI' SEQUENCE DESCRIPTION: SEQ ID NC 4J: 

CGGCACGAGG ATCGGTACCC CGCGGCATCG GCAGCTGCCG ATTCGCCGGG TTTCCCCACC 

CGAGGAAAGC CGCTACCAGA TGGCGCTGCC GAAGTAGGGC GXZCCGTTCG CGATGCGGGC 

ATGAACGGGC GG CAT CAAAT TAGTGCAGGA ACCTTTCAGT TTAG CGACGA TAATGGCTAT 

AGCACTAACG AGGATGATCC GATATGACGC AGTCGCAGAC CGTGACGGTG GATCAGCAAG 

AGATTTTGAA CAGGGCCAAC GAGGTGGAGG CCCCGATGGC GGACCCACCG ACTGATGTGC 

I GAT TAG A ZZ GTGGGAACTC ACGGNGGNTA AAAACGCCGG Z CAAC AGNTG GTNTTGT CTG 

CGCTGCGCAA CGGGGGGAAG 0NGTATGGC3 AGGTTGATGA GGAGGCTGCG ACGGGGGTGG 

ACAACGACGG GGAAGGAACT GTGCACGCAG AATCGG CC3G GGCGGTGGGA GGGGACAGT"T 

GOG C G G AA G T AAC G GATAGG CCGAGCGTGG CCACGGCC3G T G AA C G GAAC TTCATGGATO 

TCAAAGAAGC GG GAAGGAAG GTCGAAACGG GCGACGAAGG GGGATGGGTG GCGCACTGNG 

" ::r':?>:at::;n for se; ;g ;•; 
;ecue:jgz :haragtgh ; jt:gg- 

A :.E:iGT>: ■ oa^. ;^.r:; 

3 " V °E x:.e:: a:- 

3' ~-?.ANTEG::~3 3 • . •» 

> 3 TOPOLOGY : : near 



60 
120 

180 
24C 
300 
160 
4 2'J 
480 
540 
50 0 
660 
702 
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:

GGGCACGAGG ATCGAATCGC GTCGCCGGGA 3CACAGCGTC GCACTGCACG AGTGGAGGAG 

CCATGACGTA CTCGCCGGGT AACCCCGGA^ ACCCGCAAGO GCAGCCCGCA GGCTCCTACG 

GAGGOGTCAG ACCCTCGTTC GCCCACGCCG ATGAGCGTGC GAGCAAGCTA CGGATGTACC 

7GAAGATCGC GGTGGCAGTG CTGGGTGTGG OTGCGTACTT CGCCAGCTTC GGCGCAATG'" 

TCACCGTCAG TACCGAACTC GGGGGGGGTG ATGGCGCAGT GTCCGGTGAC ACTGGGCTGC 




zz gggggact:g gggtcgcaac aatcggac^a — cccatgg agjggacgt; 

:: AGTCAAC7AT TCAAACC ZGA jCGGGGGGGA ".IGAGTCGTCG 7C 7CCCGGG'. 

■- v -a": .;e v :e .4, 

3 £ ^ ee? j g e chara cte rig7igg - 
r\ ^nNGTh : i r " oar. ^ T^airz 
h rv?E. r.uclei:: acid 
: J TEA^DEEIJESc ^msl- 
r ~ ' . rT E . G : linear 



6C 

i:o 

18G 

24C 
3 00 

4::; 



AGAGCCATGT GACGGTAGTT GCGGTGGTCG JGGTACTCCG GG TATTTC 

GGACGT7TAA CAAGGCCAGC GCGTAT7CGA CCGGTTGGGC ATTG7GGGTT JTC7TGGCTT 4 a 0 

TGA7GGTGTT CZAGGZGCTZ: GCGGGA3TCC TGGCGCTGTT GGTGGAGACC GGCGGTA7CA 5 4 C 

GG3GGGGGGC GCGGGGGGOC AAGT^GGAGG CGTATGGACA GTACGGGGGG TACGGCCAG7 5 00 

ACGGGCAGTA CGGGGTGCAG GCGGGTGGG7 A G T A C GG T GA 3CAGGGTGCT CAGCAGOCCC ~6C 

7 3GGA073CA G7CGCCCGGC CCGCAGGAG7 GTGGGCAGGG TGCCGGATAT GGGTGG T\G~ — 0 



3C 
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AGTGGC3CGG CGCGGCGGGG ACGGCCGCCC AGGCCGCGGT GGTGCGCTTC CAAGAAGCAG 180 

CCAATAAGCA GAAGCAGGAA CTCGACGAGA TCTCGACGAA TATTCGTCAG GCGGGCGTCf 24 C 

AATACTCGAG GGCCGACGAG 'GAG CAGCAGC AGGCGCTGTC CTGGCAAATG GGCT7CTGAC 3 00 
CCGCTAATAC GAAAAGAAAC GGAGCAA 
(2) INFORMATION FOR SEQ ID NO: 47:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 170 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:

_ j - — -J '■ ; j-v * _iA r*. 

CCAACAACGT GTTGGCGTCG GC AAATGTG C CGNACCCGTG GATCTCGGTG ATCTTGTTCT 

TCTTCATCAG GAAGTGCACA CCGGCCACCS TGCCCTCGGN TACCTTTCGG 

(2) INFORMATION FOR SEQ ID NO: 48:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1 1 ~ base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:

SATCCGGCGG CACGGGGGGT SCC3GCGGCA 3CACCGGTGG cnCTSGCGG- ^ACGCCGGGO 



;rvat:qn " :• . :• \- ; • 

C EQ UENCE CHAR A ITER ! ST I CS 
A LENGTH: £1 sase 
3 TYP E . nuc.e:; _i c : u 

S "PANE EE NEE 3 .;i::d*j 
■ E TCPCLOGY: ^r.car 

- r. Q u ~ ? « ' . ~ . - ^ w ^ .- . ~ ~ ! I , " S 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 149 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:
GATCAGGGCT ZGCCZGCTCC CCCCAGAAGG GCGGTAACGG AGGAGCTGCC GGATTGTTTG 6 0 

JCAACGGCGG SGCCGGNGGT GCCCCCCCGT CCAACCAAGC CGGTAACGGC GGNGCCGGCG 12 C 

3AAACGGTGG TGCCGGTGGG CTGATCTGG 
(2) INFORMATION FOR SEQ ID NO: 51:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: ISb base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:
ACGCGGNAAT CCAGGGCGGT CTGGCCC3AG ITS CG CAGAC C\TGCGCGCC CTGGACTGGT 1 : : 

.-^O^^w. CGGCTTCCGC CT~GAGGATT ICTGAACCTT lAAGCCCCCC SGATAACTGA I : 

*ur"R.VATicN ?o? sec is n~ 

: ; essence "iaractep. : u~ ; v 
u :.:;:;s~:; . , . - 

I TVPE. .-uic.-j: i _ _ 

:tpanced?jess ; . : . . 

T C PC L CG V ^ ^ e a 1 
SEQUENCE CECCP. I PTI Sfi 3E " ::. \\~ - 
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JGGGGGGGGA ACACGCCGAA TGCCCAGCCG GGGGATCGCA ACGCAGCACG ^CZGCCGGCC 3 00 

GAGCCGAACG GACCGCCGCG ACCTGTCA7T GGGCGAAACG GACCCCAACC TGTCCGGATC 3 60 

GAGAACGCGG TTGGAGGATT GAGGTTCGCG CTGCGTGCTG GGTGGGTGGA GTCTGACGCC 4 2 0 

GCCCACTTCG ACTACGGTTC AGCACTCCTC AGCAAAACCA CCGGGGACCC GCCATTTCCC 48 0 

GGACAGGCGC CGCCGGTGGC CAATGACACC CGTATCGTGC TCGGCCGGCT AGACCAAAAG 54 0 

GTTTACGCCA GCGCGGAAGG CACGGACTCG AAGGCGGCGG CCCGGTTGGG CTCGGACATG 500 

GGTGAGTTCT ATA7GCGCTA CCCGGGCACC GGGATGAACG AG G AAA C C G T GTCGCTCGAC 66 Z 

GGGAACGGGG TGTCTGGAAG GGCGTCGTAT 7ACGAAGTGA AG TT GAG CG A 7GGGAGTAAG ^2 0 

- GGAACGGCG AGA737GGAC GGGGGTAA7~ GGC7CGCGC3 GGGGGAACGC ACCGGACGCC "3C 

GGGGCGGCCA AGGCGCTGGG GGAA7CGA7C C3G^~rT7GG TCGCCCCGCG GGCGGCGCGG 9 d C 

3GAGGGGCTC C 73 GAG AG 22 GGCTCGGGCG GGGOCGCGGG GGGGGGAAGT CGCTCC7ACC 960 
„C^-wGAGAC 3GACAGC3CA 3CGGAG3TTA. GGGGG: 

:::?GRMAr:c:; ?cp. geq :g :ic . 



1 se;^e::;ch: 3>ia?j^gtzk;gtig2 . 

A L -2NG7H : 3 3 2 are : nc ^ c 1 a n 
"2V?E. amine iCic 

- iliS H — — Met 31;: Vd 1 Asp Pre Asn 'eu 7hr 
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LOC 



L05 



Asr. Ala Pro Gin Pro Val Arg lie Asp Asn Pro Val Gly Gly Phe Ser 
115 120 :25 

Phe Ala Leu Pro Ala Gly Trp Val Glu Ser Asp Ala Ala His Phe Asp
130 135 140

Tyr Gly Ser Ala Leu Leu Ser Lys Thr Thr Gly Asp Pro Pro Phe Pro
145 150 155 160

Gly Gln Pro Pro Pro Val Ala Asn Asp Thr Arg Ile Val Leu Gly Arg
165 170 175



-ou Asu 



■\^a i3 er .~v^a 

1 Q s 



via Thr Asp Ser Lys Ala 



^r- ^eu 



-y 3*lu Phe Ty: Mt±r_ Pro ^vr Pro 



. a... ^er ^eu 



rtij.;; o ^ '.' Va . 



Ser Gl*. 



= er ;u3 ber Tyr 
21 £ 



Lu Val Lys Phe Ser Asp Pro Ser Lys 



~o Ai.a Ala Asr. 



Pro ,\so Ala Jlv Pro Pr 

: 6 : 



^„ ^ u 0 



:::f jrmat: j:; 



~E!JG J . a:n. ::. -i . . 
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Val Ala Ala ^eu 

(2) INFORMATION FOR SEQ ID NO: 55:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 15 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:



(2) INFORMATION FOR SEQ ID NO: 56:

(i) SEQUENCE CHARACTERISTICS:

(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:

Asp Ile Gly Ser Glu Ser Thr Glu Asp Gln Gln Xaa Ala Val

(2) INFORMATION FOR SEQ ID NO: 59:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:

A-a GIj Glu Ser ' e Sp*- y aa -T.e '/a ^ ?rr 

(2) INFORMATION FOR SEQ ID NO: 60:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 17 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear



Pro Pre 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:

Asp Pro Ala Ser Ala Pro Asp Val Pro Thr Ala Ala Gln Gln Thr Ser
1 5 10 15

Leu Leu Asn Asn Leu Ala Asp Pro Asp Val Ser Phe Ala Asp
20 25 30

(2) INFORMATION FOR SEQ ID NO: 63:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:



j - y w / 5 j ^ y .-v^ o 



n. rg S e i" G * y G I y Ay L e i: 



(2) INFORMATION FOR SEQ ID NO: 64:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 13" amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



Mo-- - v r 



. . . . . -V ^ ;1 



-"a I '/a. Phe Glv A^a Prc 



-a. ;'r- Tnr A. a a J.n 
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Tie Ala Asp His Lys Leu Lys Lys A^a A^a Glu His Glv Asp Leu Pro 
115 120 125 

Leu Ger ?he Ser Val Thr Asr. lie Gin Pro Ala Ala Ala Glv Ser Ala 
130 135 140 

Thr Ala Asp Val Ser Val Ser Gly Pro Lys Leu Ser Ser Pro Val Thr
145 150 155 160

Gln Asn Val Thr Phe Val Asn Gln Gly Gly Trp Met Leu Ser Arg Ala
165 170 175

Ser Ala Met Glu Leu Leu Gln Ala Ala Gly Xaa
180 185

(2) INFORMATION FOR SEQ ID NO: 65:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 14" amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:

Asp Glu Val Thr Val G.u Thr Thr Ser Val Phe Arg Ala Asn Phe Leu 

- 5 :5 

S-r 0.;: Leu As; Ala A. a G^n A. a Gly Thr Glu Ger Ala Val Ser 

2S 30 

* ; : '* ' ~^ ei: : ' r - Ser A^a Leu Leu Val Val Lvs Ara 

: 

;J> Asr ' 1 1 -" ~ er — -eu i.e.; Auc Gin A. a : . e Thr Ser 

5: 

A. a Glv Arc Hls 'J- Ger ^sr; ; le p ne Leu Asr Acr '.'a; Thr ". r al 
(2) INFORMATION FOR SEQ ID NO: 66:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 230 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66:

Thr Ser Asn Arg Pro Ala Arg Arg Sly Arg Arg Ala Pro Arg Asv Thr 

1 5 10 is" 

Gly Pro Asp Arg Ser Ala Ser Leu Ser Leu Val Arg Hm Arg Arg Gin 
~ 0 — 3 0 



*rg Asp A. a Leu Cys Leu oer 



oer .hr 3 ^ r. He Ser Arg Gin Ser 



As:i ^ ?rc? P - Ala ^ Glv -la Ala Asn Tvr Ser Arc Arc. Asr 

50 5S 6C 



Pne Asp Va. Arg lie Lys lie Phe Me 

BO 



4et: Leu Val Thr A. a Val Val Leu 

^ ^ 7/1 



-ys ?/s Ser GLv Val Ala Thr A. a Ala Pro Lys Thr Tyr Cvs 



9o 



Jl'a 

95 



A:,., .n: _^y j_r. A^a Cys 3 In lie Gin Met. Se: 
-■^ 110 



o e i ^eu yrc ^er v 1 " _rv "->- 



As;: Tvr 



. ■ i ^ a .-^ j. 



ilu Ala Tyr Glu Leu Asr 



■ w.i .... j y ... . r. m , -j . yr .^rg ^vc Pre ; 
(2) INFORMATION FOR SEQ ID NO: 67:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 132 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:

Thr Ala Ala Ser Asp Asn Phe Gln Leu Ser Gln Gly Gly Gln Gly Phe
1 5 10 15

A^a I^e Pro lie Gly Gin Ala Met Aid. lie Ala Glv Gl;: :1» Am Ser 
"0 25 30 

lis ^le Gly ?r: Thr Ala Phe Pen Glv 
3 5 40 45 

- eu G1 V Val 7al A5 P Asn Gly Asn Gly Ala Arg Val Gin Arp Yal 

Val Gly Ser Ala Pro Ala Ala Ser Leu Gly lie Ser Thr Gly Asp Val 

7 G 75 " SO 

T.e Tnr Ala Yal Asp Gly A_a ?rz lie Asn Ser A. a Thr Ala Met Ala 
Asp Ala L-; Asn Gly H;g H;s Pre G.v Ago Yal lie Scr Yal Asn Trr 



Gly Pro Al a 

(2) INFORMATION FOR SEQ ID NO: 68:

A LENGTH : : : " ir : i ■ ; - 

I GTRAN3EDNE3G ; . . • • 
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Mcl Ala Arg 7a 1 Arg Arg Arg Ala lie Trp Arg Sly Pro Ala Thr Xaa 
50 55 5G 

Ser Ala 31*/ Met Ala Arg 7a 1 Arg Arg Trp Xaa Val Mer Pro Xaa 7a! 
65 70 75 30 

lie Glr. Ser Thr Xaa lie Arg Xaa Xaa Gly Pro Phe Asp Asn Arg Gly 
35 go 95 

Ser 31 u Arg Lys 

iOO 

(2) INFORMATION FOR SEQ ID NO: 69:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 163 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



SEQUENCE DESCRIPTION: SE; 



Met Thr Aso Asp lie leu Leu lie Asp Thr Asp 31 u Arg 7a 1 Arg Thr 

Leu 'hr Leu Asr. Arg Pro Jlr. Ser Arg Aon Ala L-u Ser Ala A. a Leu 

25 30 

Arg Asp Arg ^h? Phe Ala Xaa Le- Xaa Asc A.a COu Xaa Aod Aso Asc 

- :5 



- dU : L-' S A. a Sly Arg A- a A:;o Arg A._i a g :j 

~ 5 '5 3C 



Ar.r Ar 



(2) INFORMATION FOR SEQ ID NO: 70:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 344 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70:

Met Lys Phe Val Asn His Ile Glu Pro Val Ala Pro Arg Arg Ala Gly
1 5 10 15

Z^y Ala 7a* Ala Glu Val Tyr Ala :3lu Ala Arg Arg Glu Phe Gly Arg 
-° 25 10 

^eu Pro 31 'a Pro le^ Ala Met Leu Ger Pro Agd 31u Glv _eu ^eu Thr 
35 40 4 - 



,1a Gly Trp Ala Thr Leu Arg Glu Thr Leu Leu Val Glv G 



In Val Pre 



5L 



""i GiV Arc? Lv = 



^ 6 : 



O i *a .A. 



.a Va_ A* a Ala Ala Val Ala Ala Ser Leu Arg 

° 5 "0 ^5 go 

"ys Pre Trp Cys Val Asp A-u !us Tnr Thr Met Leu Tyr Ala A. a Glv 

Bb jo 95 

> . r. .rvr -sp Thr A^a A.u I^e Leu Ala 3 Iv Thr Ala Pro Ala Ala 



?r " ' J — ^ h -e llv Pro Asd 7a.. Ala Ala 



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:hr Arg Gin Val Val Arg Arg Val Val Gly Ser Trp His Gl 



245 



y Gl u Pro 



250 2S5 



Met Pro Met Ser Ser Arg Trp Thr Asn Glu His Thr Ala Glu Leu Pro
260 265 270

Ala Asp Leu His Ala Pro Thr Arg Leu Ala Leu Leu Thr Gly Leu Ala
275 280 285

Pro His Gln Val Thr Asp Asp Asp Val Ala Ala Ala Arg Ser Leu Leu
290 295 300

Asp -Thr Asp Ala Ala Leu Val Gly Ala Leu Ala Trr: Ala Ala o ne Th- 

105 3:0 315 320 



rrp llo Gly Ala Ala Ala Glu Gly Gln 
330 -is 



^ l d Aru ^rg l^e Glv Thr " 

Val Ser Arg Gl n Asn Pro Thr C 
3 4 C 

(2) INFORMATION FOR SEQ ID NO: 71:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 485 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71:

.-.:c Asc Pro Asc Met P- V - ~ 



.-^a . _ , , . . AG p _ e Jvs /a , 



\ ^ 1 1 m i a . a _ 



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Thr Leu Leu Arg Aan Leu Glu Phe Leu Pro Asn S'er ~h - ' eu 

130 ^35 X4C 



Asn Ser Gly Thr Asp Leu Gly Leu Leu Ala Gly Cys Phe Val Leu Pro 

160 



145 *50 155 



lie Glu Asp Ser Leu Gin Ser lie Phe 



Ala Thr Leu Gly Gin Ala Ala 
165 170 



Ala Phe Ser His Leu 



Glu Leu Gin Arg Ala Gly Glv civ Thr Glv Tvr 

Arg Pro Ala Gly As? Arg 7a. Ala Ser Thr Glv Glv Thr Ala 3e - G- 

' ' 205 ~ " 

" r ° ; a : 3er Phc LeV] -eu Tyr Asd Ser Ala Ala Gly Val val ser 



lib 



Met Gly Gly Arg Arg .Arg Gly Ala Cvg 
225 230 * ' ; 35 



Met Ala Val Le'j As:: 7a .1 Ser 

24 0 



His Pro As? He Cys Asp Phe Val Thr Ala Lys Ala 



245 25c 



Glu Ser Pro Ser 

255 



Glu Leu Pre His ?he Asn Leu Ser Val Glv Val 



:60 



Thr Asp Ala Phe Leu 



16 5 



Arg Ala Val Glu Arg Asn Gl> Leu His Arg Leu Val Asn Pro ^ T- - 

285 

J< '' Val Ald Ar? : ' e! Ala -i- Leu Phe Asp \ • ^ 

30C ' ' ' 

''I - ys A - 5 A ^ HlJ - As? .;: , Leu 7a. Ph-, Leu Asc 



^ AS - Alj ?rc Va: P- GLv Ar , ^ c - v 



35- 



^ A r ; Mr- L--. a.., asc Jly A — 7u. Asp 



- -eu Glu Glu Val A... 3 1 v V,; 



i Va. Arg Phe Lev: Asr 
• 2 _ 



■ 5 - o'! r Ar r 



Leu Leu Ala Ala Leu Gly Ile Pro Tyr Asp Ser Glu Glu Ala Val Arg
420 425 430

Leu Ala Thr Arg Leu Met Arg Arg Ile Gln Gln Ala Ala His Thr Ala
435 440 445

Ser Arg Arg Leu Ala Glu Glu Arg Gly Ala Phe Pro Ala Phe Thr Asp
450 455 460

Ser Arg Phe Ala Arg Ser Gly Pro Arg Arg Asn Ala Gln Val Thr Ser
465 470 475 480

Val Ala Pro Thr Gly
485

(2) INFORMATION FOR SEQ ID NO: 72:

(A) LENGTH: 267 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72:

Gly Val lie Val Leu Asp Leu Glu Arg Gly Pre Leu Pro Thr Glu 

5 10 15 

lie Tyr Trp Arg Arg .Arg Gly Leu Ala Leu Gly lie Ala Val Val Val 
20 21 30 

Val jly lie Aid "-"a. A -a I1*j "a. I 1 e A- a ?ne Val Ase Her oer Ala 



A*_i /-.ar :-r: A~ a ^cr A', a 31:: ler His 

Gin Ala Pro Gin Pro Ala Gly Gin Thr Glu 

3C 



ji." -nr ueu A* a 



:r Me-. Va . '.' i : T u .r Asn 






Ala Tyr Val Tyr Ser Leu Asp Asn Lys Arg Leu Trp Ser Asr: Leu Asp 
155 1 7 C 175 

Cys Ala Pro Ser Asn Glu Thr Leu Val Lys Thr Phe Ser Pro Gly Glu
180 185 190

Gln Val Thr Thr Ala Val Thr Trp Thr Gly Met Gly Ser Ala Pro Arg
195 200 205

Cys Pro Leu Pro Arg Pro Ala Ile Gly Pro Gly Thr Tyr Asn Leu Val
210 215 220

Val Gln Leu Gly Asn Leu Arg Ser Leu Pro Val Pro Phe Ile Leu Asn
225 230 235 240

Gln Pro Pro Pro Pro Pro Gly Pro Val Pro Ala Pro Gly Pro Ala Gln
245 250 255

Ala Pro Pro Pro Glu Ser Pro Ala Gln Gly Gly
260 265

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9~ amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



HECraNCZ GESGR : PT7GN : ^EC ~2 NC 



+ ,uJ rAsn .-L^a 1, v V.i : pro :.v:; 3 ; . v 'V-t 1 Val 



3: 



(A) LENGTH: 364 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74:

Gly Ala Ala Val Ser Leu Leu Ala Ala Gly Thr Lei: Val Leu Thr Ala 
5 io is 

Ova Gly Gly Gly Thr Asa Ser Ser Ser Ser Gly Ala Gly Gly Thr Ser 
20 25 * 30* 

Gly Ser Val His Cys Gly Gly Lys Lys Glu Leu His Ser Ser Gly Ser 
35 40 45 

Thr Ala Gin Glu A3n Ala Met Glu Gin ?he Val Tyr Ala Tvr Val Am 

50 55 60 

Ser Cys Pro Gly TVr Thr Leu Asp Tyr Asn Ala Asn Gly Ser Gly Ala 

65 7 3 7 5 ac 

Gly Val Thr Gin Phe Leu Asn Asn Glu Thr Asp Phe Ala Gly Ser Asp 

35 9C ?5 

Val Pro Leu Asr. Pro Ser Thr Gly Gin Pro Asp Arg Ser Ala 31a Arg 
IOC 135 11C 

Gys Gly Ser Pro A. a :rp Asp Leu Pro Thr Val Phe Gly ?r: lie Ala 

115 - ~» n '■">'■; 

- — -.* s -a^ --or Lhr ~cu .-^r. .ou .asc 1.-.' Pro 







Ser Ala Gly Pro Asp Pro Val Ala Ile Thr Thr Glu Ser Val Gly Lys
260 265 270

Thr Ile Ala Gly Ala Lys Ile Met Gly Gln Gly Asn Asp Leu Val Leu
275 280 285

Asp Thr Ser Ser Phe Tyr Arg Pro Thr Gln Pro Gly Ser Tyr Pro Ile
290 295 300

Val Leu Ala Thr Tyr Glu Ile Val Cys Ser Lys Tyr Pro Asp Ala Thr
305 310 315 320

Thr Gly Thr Ala Val Arg Ala Phe Met: Gin Ala XI a. He Gly Pro Gly 
225 330 335 

Gln Glu Gly Leu Asp Gln Tyr Gly Ser Ile Pro Leu Pro Lys Ser Phe
340 345 350

Gln Ala Lys Leu Ala Ala Ala Val Asn Ala Ile Ser
355 360

(2) INFORMATION FOR SEQ ID NO: 75:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 309 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75:








La Asp His Gly Ala ?r:j Val Arg Gly Arg Gly Pro His Arg Gly Va] 
130 135 140 



His Arc; 



r VJ 1 » 



145 



Val Phe Val Arg Arg Val Pro Gly 7al Arg 
155 isc 



Cys Ala His Arg Arg Gly His Arg Arg Vai Ala Ala Pro Gly Gin Gly 
16^ 170 175 

Asp Vai Leu Arg Ala Gly Leu Arg Val Glu .Arg Leu Arg Pro Val Ala 
130 las 190 



195 



*y Ger Gin .Arg Ala Asp Gly Arg Val 



Jhe Ar 9 :ie Arg Arg Gly A. a Arg Leu Pro Ala Arg Arcr Ger Arc 

J1 - 115 220 








Ser Pro Leu Glu Arg Arg Phe Thr Cys Cvs Ser Pro Ala Val G 1 " Cvs 
50 55 60 

Arg Phe Arg Ser Phe Pro Val Arg Arg Leu Ala Leu Glv Ala Ar~ Thr 
55 7 ° 75 80 

Ser Arg Thr Leu Gly vai Arg Arg Thr Leu Ser Gin Trp Asn Leu Ser 
85 90 95 

Pro Arg Ala Gin Pro Ser Cys Ala Val Thr Val Glu Ser His Thr His 
100 105 110 

Ala Ser Pro Arg Me- Ala Lys Leu Ala Arg Val Val Glv Leu Va ' Gin 
115 -2 0 125 



Glu Glu Gin Pro Ser Asp Met Thr Acn Hi: Pro Arg -y>- ^ r ^ro Pre 

140 



-30 :35 



Pro Gin Gin Pro Gly Thr Pro Jly Tyr Ala Gin Glv Gin Sin Gin -hi 

145 ^ 155 " 160 



lyr Pro Pro Ser Pro Pro Pro Gin 



^ 17c 



Pro Thr Gin Tvr Arg Gin Pro Tyr Glu Ala Leu 



Sly Gly Thr Arg Pro 



180 155 190 

;iy l^eu lie Pro Glv Val He ? — Thr Met Thr Pro Pro Pre 31 

195 —0 2 3b 



' a - Ar ~ }ln Ar - -rg Ala Sly v.. :<e ,, Ala lie Jl 



A. a Val Thr 



He ."a. /a. ■;.->>- \ v - ■ - 

' " ~ ■ * : - ■■ ■ .--...I .^^ ji js:' ^ei: , a- 

•-' 24C 

: ' ?n " A3n Ar = s ^= -i-v -v ?r c 7a. ... 



; TV ^ ; ' — 3.v. :.- ah 

- 3 - : 9 1 ^ - - 



1 






340 345 350 

He Ala Val Val Arg Val Gin Giv Val Ser Gly Leu Thr Pro He 3er 

35o 360 365 

Leu Gly Ser Ser Ser Asp Leu Arg Val Gly Gin Pro Val Leu Ala He 
370 375 38C 

Gly Ser Pro Leu Gly Leu Glu Gly Thr Val Thr Thr Gly He Val Ser 
385 390 395 ' 4 00 

Aia Leu Asn Arg Pro Va3 Ser Thr Thr Gly Glu Ala Gly Asn Gin Asn 
405 410 " 415 

Thr Val Leu Asp Aid Tie Gin Thr Asp Ala Ala lie Asn Pro Glv Asn 
420 425 430 



^a Leu Val As:; y«r 



.a- V; 



M^a _e rt^a Thr Leu Gly Ala Asp Ser A ' a 
4 5: 45b 



Leu Gly Phe Al.- 
470 



Pro Val Asp Glr. Ala Lys Arn He 
4 H 433 



..a ^-j.u Lcj 



435 

Lr. 7a ~ Thr Asn As;; Lvs As: 



.n ^ a je: 
490 



la Se: Leu 
4 95 



510 



ASD Lt: 




(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77:

Met Asn Asp Gly Lys Arg Ala Val Thr Ser Ala Val Leu Val Val Leu
1 5 10 15

Gly Ala Cys Leu Ala Leu Trp Leu Ser Gly Cys Ser Ser Pro Lys Pro
20 25 30



Asp Ala Glu Glu Gin Gly Val Pro Val Ser P 



35 40 



ro Thr Ala Ser Asp Pre 
45 



Ala Leu Leu Ala Glu He Arc? g;- Ser Leu Asp Ala Thr Lys Glv Leu 

5 0 ^ £ . „ 



Ihr Ser Val H13 Val Ala Val Arg Thr Thr Gly Lys Val Asp Ser L 



eu 

ac 



A^a Lvs Glv Val 



105 



vy Val Pre ?he Arc: 



Ast: Asn I 



-e ^c: Va- Lys Leu Phe Asp Asp Trp Ser Asn 



■^r . .e Ser 



Ser Arg Val Leu An:: ?r 
14 0 



'. r al Thr Asn leu Gl 




(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78:

Val Ile Asp Ile Ile Gly Thr Ser Pro Thr Ser Trp Glu Gln Ala Ala
1 5 10 15

Ala Glu Ala Val Gln Arg Ala Arg Asp Ser Val Asp Asp Ile Arg Val
20 25 30

Ala Arg Val Ile Glu Gln Asp Met Ala Val Asp Ser Ala Gly Lys Ile
35 40 45

Thr Tyr Arg Ile Lys Leu Glu Val Ser Phe Lys Met Arg Pro Ala Gln
50 55 60

Pro Arg
65

(2) INFORMATION FOR SEQ ID NO: 79:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 69 amino acids

(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79:

Val Pro Pro Ala ?:o Pro L~ u ? rc u ro ^ eu p _ 3er prQ ;1g ^ 

A ^ :or 7r ~ ?r = Pre Pre .Ma ^ro -re Va.. Ala 

^' ~' r "' ■ r " ' Aer Pr ~ r " : "- — - - : ^P Pre : rp Pre Pro A_a Pre " J re 



-eu . J ro Tvr -v,^ , ..... . . 

- - - - : - .- : j re _tiu Pre ^re Jer Pre '^e 

5 



iE. UENCE ^ARACTEPISTICI . 
A LENGTH: ISz .ir:r.: - 
~ IV PE arr.mz i c ; 

iCP.ANEEENESS : . : . 



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Val Leu Ma Aid Val Gly Leu Gly Leu 



Leu Ala Thr Ala Pro Ala Gin 



Ala 



30 



Ala Pro Pro 
35 



Leu Ser Gin Asp Arg ?he Al 



AC 



a Asp ?he Pro Ala Leu 



co Leu Asp Pro Ser Al 



50 



a Met Val Ala Gin Val Ala Pr 



55 



o Gin Val Val 



GO 



Asn lie Asn Thr Lys Leu Gly Tyr 



65 



nv lie V 



70 



/ Tyr Asn Asn Ala Val Glv 



75 



Ala Gly Thr 



a- He Asp Pre As:; Glv 
95 



Val Val Leu Thr Asn Asn Hi 



90 



8C 



5 Veil 



^ly A* a Thr Asn lit 



*la Phe Ser Va. Glv 



y Ser Glv 



*a^ Va_< Gl' 



Asp Arg Th: 



--n .-iSc '.'al Ala 



Ler: Gl: 
130 



?;„ ^ V G:y "° v - prc A.a Ala He Glv 

14C 



Gly Val Ala Val Gl- 



14 5 



'-'al Va : A] a M 



31 V Thr ?ro Ala Val Pro Glv 



e - Asn Ser Glv 

150 

Arg Val Val Ala Leu 



190 



v Jlv ?r: 



Val Asn 31- 



-j - n Va 1 Val 3 1 vfp- 



■ r ■ ■ * a . 
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305 310 315 32c 

Ala Leu Asn Gly H : s His Pro Gly Aso Vai lie 3er Val Asr. --5 31- 
325 330 335 

Thr Lys 5er Gly Gly Thr Arg Thr Gly Asn Val Thr Leu Ai 



340 345 

Pro Pro Ala 
355 

INFORMATION FOR SEQ ID NO: 81:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: ZCS amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81:

Ser Pre Lys Pre Asp Ala Glu Glu G 1 - ^ 

i s r.'' 



a 
350 



Pre Val 



Ser Asd Pro Al 



a ^eu leu Ala Glu He Arg Gin $ ez 



...r .ys S-y Leu Thr ier Va. Hid 7a: A. a Val Ara Thr — - r - 

; S " 3er - SU " eu j: " ::e A- a Asp Va, Aoc Va. Ara A. a 




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s ^ 



Leu Thr Gin Ser Lys Trp Asn GIu Pro Val Asn Va 

INFORMATION FOR SEQ ID NO: 82:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 286 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82:

Sly Asp Ser ?he ?rp Ala Ala Ala Asp Gin Met Ala Arg Gly ?he Val 

5 10 * n 



^ MSP 

:o5 




I-e Xaa 



.-i.-. a Arg Met C\ 



Asn Pre Glu As 



He Phe ?ne 

3 0 



WO 99/42118 PCT/US99/03265



"° 215 220 

JIv Phe Ser Asp Thr Arg Pro Ala Gly Arg Arg Tyr Phe Asn Thr Asd 
230 235 



Ala Glu Ser Gin Val Gly Arg Gly Phe Gly Arg Gl 



y Trp Pro Glv Arg 
245 250 255 

Arg 7al Asn lie Asp Pro Phe Gly Ala Gly Arg Gly Pro Pro Ala Gin 
260 265 27 o 

Leu Pro Gly Phe Asp Glu Gly Gly Glv Leu Arg Pro Xaa Lys 
: - 2SC 285 

INFORMATION FOR SEQ ID NO: 83:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: : 7 1 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83:

Thr Lys Phe His Ala Gin Glu Gin He His Asn Glu Phe Thr 

* ^ 15 



A^a Ala Gin Gin Tyr Val Ala He A I n Val 



yr Phe Asp Ser Glu Asp 

3 0 



DUn 1 - 



r. Ala val Glu 



4 



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INFORMATION FOR 5EQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1C amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

Ar^ Ala Asp Slu Arg Lys Asn Thr Thr Met Lys Met Val Lvs 3er 



Aid .Ala -SI' 



jfc?lJ Thr Ala Ala Ala Ala lie Gly Ala Ala Ala Ala Gly 



-a; Ja^ Tvr 



>5 



.'3 1 Val Phe Gly Aid Pr- Lpt: ?r: Ie'_: Asp Pro Xaa Ser Ala P 
50 55 ' 60 



Aaa 



Pre Thr A^a A_a Sir. T: :; "Thr Xaa Leu Leu Asn Xaa Leu Xaa Asn 



'3 



7 ^ 9 0 



rc nsn v'a^ Ser Phe Xaa As:; Lys Sly Ser Leu Val siu Giv Sir lie 
— V 31 V -'La a Si u S 1 y X a a X a a A r -J r u x a a vS 1 n 



INFORMATION FOR SEQ ID NO: 85:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: IS 5 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear






31y Ly3 Asn Arg Arg Leu Cys Arg THr Pro Ser Ser Aim Gin Ar; Gl 



85 



Jlu Leu Gly Val Arg Trp lie Pr 



90 



o Arg Ser Arg Cys Ala Cvs Va 1 ~v 
100 105 110 " *" 

Val Gly Has Arg Ala Arg Arg Gly Thr Tyr His Arg Arg 

115 120 12S 

INFORMATION FOR SEQ ID NO: 86:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 117 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86:

-'•'3 Aso Ala Val Me*" G'" PHo 'r- r 1 .. - ^ ~i 

* -* - ^ T -y v.-.*/ t^rc Leu Ala Val 

5 10 1; 

/a- Asp Glr. Gin Leu Val Thr A*- a v f i ' ^ * ^ - 

^m Ala A.a Aid Val ?rz Val Va. Pne Lou Tar Ala Trp Ty- ^l" Le- 
3j - 45 



.la Asp l.su Al£ 



Val " j r.e Val 
35 






Met: Tyr Arg ?n e Ala Lvs Arg Thr 



^eu Met: Leu Ala Ala Cvs lie Leu 

10 



la Thr Gly Val Ala Gly Leu Gly Vdl Gly Ala Gin Ser Ala A a Gin 

^5 i0 



Thr 



Pro Val Pro Asp Tyr Tyr Trp Cys Pro Gly 



- Gin Pro Phe Asp 

35 40 A* 



Pro Ala Trp Gly Pro Asa Trp Asp Pro Tyr Tax Cys His Asp Asc Phe 
50 ^5 60 

I 1 ! 3 ^ p 3er ASP J - ?ro Asp Hi 3 Ser Arg Asp Tvr Pro Glv Pre 

o:i ^0 -r 

~e Leu Glu Gly Pro Val Leu Aerj Ago Pro Glv Ala Ala Pro d-o °ro 

35 - 
Pro Ala Ala Jly :-l y v_ y Ald 

LFGP^ATIO?: EC? SEC IL :;-:83: 

i' 3SCUHNCE TliAPACTHRlSTlEG ■ 

» A ■ L EL T GTH . 3 8 a rr. i no a c : a s 
'3! TYPE: ammo acid 
" ! 3 TRALT EL NX3 3 a : 1 e 



3EQLENCE 



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,'D) TOPOLOGY, linear 
.XI. SEQUENCE DESCRIPTION: SEQ ID NO: 39: 

Thr Asp Ala Ala Thr leu Ala Gin Glu Ala Glv Asp. ?he Giu a— -•«. 

Ser Gly Asp Leu Lys Thr Clr. lie Asp Gin Val Glu Ser Thr Ala G1-- 
J0 25 3 0 



Ser Leu Gin Gly Glr. Trp Arg Glv Ala Ala Gly Thr Ala Ala Gin Ala 

35 40 45 

Ala Val 7a: Arg ?he Glr. Giu Ala Ala Asr. Lvs Gin Lys Glr. 3* " eu 

•° 55 go 

?f jlU ,er T " r Asa :: * 31 r. Ala Glv Val air. Tyr 3e: Arg 



Ala Asp Glu GL;: 3- 31n Gl:i Ala leu Ser Ger Gin Met Glv 

INFORMATION FOR SEQ ID NO: 90:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 166 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90:

■-'er. Thr "in Jer Sir. Thr '.'a 



: '-'a- Asr -In J In :Jlu :i 



e _eu .\cr. 



^ A. a Asn Glu Va . Glu Al i Pro 



ALi Asp Pre Thr .^sc '.'a. 

■ ' ''-^a Asr: Aid A . . • : ^ 



Ar 7 



w i Thr .j»r A-;. A:u Asr: A 1 1 A . Ay i Xaa 



^ - a A 1 • 



Aur AAi At* , A:;p A. 



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Asn Phe Met Asp Leu Lys 
130 



31u Ala Ala Arg Lys Leu 3lu Thr Glv Asd 
^35 140 



Gin Glv Ala Ser Leu Ala His Xaa Sly Asp Gly Trp Aun Thr Xae 



145 150 

Leu Thr Leu Gin Gly Asp 
ICS 

INFORMATION FOR SEQ ID NO: 91:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 5 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91:

Arg Ala Glu Arg Met
1 5

INFORMATION FOR SEQ ID NO: 92:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: :53 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



155 



160 



Thr Ala Tvr 01 v Leu Th: 

3 0 




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~ - -> ; o 1 2 s 

Asn Asn Val Pro Gln Ala Leu Lys Gln Leu Ala Gln Pro Thr Gln Gly
130 135 140

Thr Thr Pro Ser Ser Lys Leu Gly Gly Leu Trp Lys Thr Val Ser Pro
145 150 155 160

His Arg Ser Pro Ile Ser Asn Met Val Ser Met Ala Asn Asn His Met
165 170 175

Ser Met Thr Asn Ser Gly Val Ser Met Thr Asn Thr Leu Ser Ser Met
180 185 190

Leu Lys Gly ?he Ala Prc AliI Ala Ala Ala - ln AIa ^ ^ ^ ^ 
195 200 20S 



la Gl:: As;: Gly Val Ar^ Ala '-let 

210 



-er Ger .eu Gly Strr Ser Leu 12 

:lc 



Ser Ser Gly Leu Gly Gly Gly Val Ala Al 



225 



a Asn Leu Gly Arg Ala Al£ 



235 



I 4 0 



Ger Val Arg Tyr Gly H:s Arj Asp 

Arg Arq Asn Gly Gly ?:-; Al l 
163 

::;f grmatigx top. geq il r;c : 93 : 



vj-y Lys Tyr Ala Xaa Ser Gi\ 
25C :55 



il-le:;cg 

t\ -.^NCTK : 111 :inu::c ic:;;:r> 
= TYPE, arcmc i::: 
1 GTSAirDEDNESG . 3 1 nc 1 e 
-GPGLOGV: linear 



xi J^iNCE LESCR1TT1LN SEC 1L :;~ ■ ?7 



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^ 90 95 

Pro Lys Ala Lys Ser His Val Thr Val Val Ala Val Leu Gly Val Leu
100 105 110

Gly Val Phe Leu Met Val Ser Ala Thr Phe Asn Lys Pro Ser Ala Tyr
115 120 125

Ser Thr Gly Trp Ala Leu Trp Val Val Leu Ala Phe Ile Val Phe Gln
130 135 140

Ala Val Ala Ala Val Leu Ala Leu Leu Val Glu Thr Gly Ala Ile Thr
145 150 155 160

Ala Pre Ala Pro Arg Pro Lvs ^h~ As^ ^-o ^->- G 1 - ~*,v- -v, 

155 17 0 ~ " ^ 

Tyr Gly 31;; Tvr 3 I y Sin Tyr 3.Lv Val 31;. Pro 3 1 v 3 : y ~vr Tvr Glv 

,7.r. -oln o-y .^a Sir 3m A^a Ala Gly Leu Gin Se: Pro Glv Pro 

: s ~ 



31:: Ser Pro Gin =>ro Pre Glv 
210 215 



v Ser Gin Tyr Gly Gly TVr 3t 



Ser Se: Pro S^r 3_n Se r G 1 - o- -s- -> - 

2ZS 230 ^35 240 

Sin :-ro Pro Ala 7-. Ser Gly ,er Gin Gin Ser His Gin Sly Pro Ser 

3 4 = z 5 c . ^ c r 



Ala Pro Ya^ A:::; Tvr Ser Aso 



"P3L3GV linear 






3TCTTCGGC3 CGCOACTGCC GTTSGACCCG GCATCCGCCC CTGACGTCCC GACGGGCGGC 180
CAGTTGACGA 3CCTGCTCAA CAGCCTCGCG GATGCCAACG TGTCGTTTGC GAACAAGGGC 240
AGTC7GGTCG AGGGGGGCAT CGGGGGGACG GAGGCGCGCA TCGCCGACCA CAAGCTGAAG 300
AAGGCCGCGG AGCACGGGGA TCTGCCGCTG TCGTTCAGCG TGACGAACAT CCAGCCGGCG 360
GCCGCCGGTT CGGCCACCGC CGACGTTTCG GTCTCGOGTC CGAAGCTCTC GTCGCCGGTC 420
ACGCAGAACG TGACGTTCGT GAATCAAGGC GGCTGGATCC TGTCACGCGC ATCGGCGATG 480
3AGTTGCTCC ^GGCCGCAGG GAACTGA 507

INFORMATION FOR SEQ ID NO: 95:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 15 3 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95:

Me-, r.vs Men Val Evs 3c- - ■ * ** - —

- ° -"^ a — -e:i Thr A^a A. a Ala A!

■ a Ala Vai Thr 3e: II- M-r. Ala Gly Gly ? —

- 5 1 o



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Glu Leu Leu Gin Ala Ala Glv Asn 
165 

(2) INFORMATION FOR SEQ ID NO: 96:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 500 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96:
::TGGCAATG --TTGACCG TCGGGGCCGG GGTCGCCTCC GCAGATCCCG TGGACGCGGT 
GATTAACACC ACCTGCAATT ACGGCCAGGT AGTAGCTGCG CTCAACGCGA CGGATCCGGG i; 
OGGTGCCGCA CAGTTCAACO CCTCACCSGT GGCGCAGTCC q — — 

J3CACCGCCA GCTCACCGCG CTGCCATGGC CGCGCAATTG CAAGCTGTGC CGGGGGCGGC 
A G T A C A 7 C GGCCTTCTCG AGTCGGT7GC CGGCTCCTGC AAGAACTATT AAGGCCATGC 
0GGCCCCA7C CCGCGACCCG 3CATCGTCGC CGGGGCTAGG CCAGATTGCC CCGCTCCTCA 
-GACC CGCCATCGTC GCCGGGGC7A GGCGAGATTG GCCCGCTCCT 



^GGGCG GATC7CGTGC CGAATTCCTG CAGCCCGGGG 

INFORMATION FOR SEQ ID NO: 97:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: ^6 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



5 0 



24C 
3 00 
3 60 
-*20 

TGTAGAGCG 480 

5 00 



-'a., Asp Al.-i 



■'3 Asr. ?yr 31 y G1:; 



■i A:a 

4 r 






Gln Tyr Ile Gly Leu Val Glu Ser Val Ala Gly Ser Cys Asn Asn Tyr
85 90 95

INFORMATION FOR SEQ ID NO: 98:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 154 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98:
A T G A C AG AG C AG CAG T GG AA TTTCGCGGGT ATCGAGGCCG CGGCAAGCGC AATCGAGGGA SO 

j^^r^. ^ivjvj o^GGTAGOGG CTCGGAAGCG CACC 

INFORMATION FOR SEQ ID NO: 99:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 51 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



Jv. 



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GCTCGAAACG CGGCACAGCG GACGGTGGCT CCGNCGAGGC GCTGNCTCCA AAATCGCTGA 130 

3ACAATTCGN GGGGGGCGCC TACAAGOAAG TCGGTGCTGA ATTCGNCSNG TATCTGGTCS -40 

ACCTGTGTGG TCTGNAG CCG GACGAAGCGG TGCTCGACGT CG 2SC 

(2) INFORMATION FOR SEQ ID NO: 101:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3058 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101:

:.ACJG GTGCGAGTGC TCGGGCCCTT TGAGGATGGA 3TGCACGTGT CTTTCGTGAT nO 

CAT AC CCA GAGATGTT'GG GGGCGGCGGC TGACACGCTG CAGAGCAT 



j t uvj.«j^^..„ L.o^i^v^Gn 18 0 

240 



'GAGGTGTCG GCGCTGACTG CGGCGCACTT CGCGGCACAT GCCGCGATGT ATCAGTCC 
AGCGCTCGG GC^GC^CCGA TTCATGACGA CTTCGTOGCC ACCCTTCCEA GCAGCGCCAG 3 CO 

TCGTATGCG GCCAGTGAAG TCGCCAATGC GCCGGCCGCC AGCTAAGCCA GG AACACTCS 360 



' - ■" *- ^ — i-rvLji-iJAj-L _ in.Jrtnn . .•^Gvjvj.mC AC -~ 

ivj.-ii -rtrt^^^. _ o^-^jAGG ATf: TACSCCGGCC 

j i. -rvu w.jTGG CG AG""GAG"~"G™ 

- » jj-ju ^ _ . jACGGTGGGG tgotggatao 

■ — — C wTA TGTGGCGTGG ATGAGCGTC- 



juuu^.\ jGCCGAGCTC ACCGC'CGCCC t60 



C3A7GACCAA C7CGGG7G7G TCGA7GACCA ACACC7TGAG C7CGA7GC7G AAGGGCTTTG 12 00 

C7CCGGCGGC GGCCGCCCAG GCCGTGCAAA CCGCGGCGCA AAACGGGGTC CGGGCGATGA 12 60 

GC7CGC7GGG CAGCTCGCTG GGTTCTT CGG CTC7GGGCGG 7GGGG7GGCC GCCAACTTGG 132 0 

37CGGGCGGC C7CGG7CGG7 TCGTTGTCGG TGCCGCAGGC CTGGGCCGCG GCCAACCAGG 13 3 0 

CAGTCACCCT GGCGGCGCGG GCGCTGCCGC TGACCAGCCT GACCAGCGCC CCGGAAAGAG 144 0 

GGCGGGGGCA GA7GC7GGGC GGGCTGCCGG TGGGGCAGAT GGGCGCCAGG GCCGGTGGTG 15 0 0 

3GC7CAG7GG 7G7CC7GCG7 G7TCC3CCGG GACCC7ATG7 GA7GCCGCAT 7CTCCGGCGG 15 6 0 

CCGGC7AGGA GAGGGGGCC.C AGACTG7CG7 7AT7TGACCA GTGATCGGCG GTCTCGGTG7 1S2C 

- ^^^^^^^^ ^nu l v_j-^a . j . ...j^rt ^ jACAA G77ACAGG7A 27AGGTCCAG -od'J 

277CAACAAG GAGACAGG-GA ACA7GGCC7G ^CGTTTTATG ACGGATCGGC AC3CGA7GCG — 4 c 

G3ACA7GGCG GGCCG7777G AGGTGGAGGC GGAGACGGTG GAGGACGAGG CTCGCCGGAT 13GC 

C7GGGCG7CC G GG CAAAACA T77CCGG7GC GGGC7GGAG7 GGGA7GGCCG AGGCGACCTC 136 0 

GGTAGAGAGG A7GGCCCAGA 7 G AA7 GAG GG 3TTTCGCAAC ATCGTGAACA TGCTGCACGG 192C 

GG7GCG7GAC GCGC7GG7TC GCGACGCGAA CAACTACGAG GAGCA AGAGC AGGCCTCCCA I9SC 

G2AGA72C7C AGCAGCTAAC GTCAGGGGGT GCAGCACAA7 AC7777ACAA GCGAAGGAGA 2 04: 

^ GG 2GAGG 2CGGG77GC7 GG AGO CGG AG 2A7GAGGCCA 7CA77CG7GA 7GTG77GAGC "\tr 

■ : "G GAG G7GA7 27ACGAGGAG GCCAACGCCC AGGGGGAGAA GG7GCAGGC7 2 2 3 C 
^ -»"~-.'3::GGA AACCGACAG~ :G~G7CG3C~ "AG;- GGGC TGACACCAG 2 14 : 






3GTGC3CACC CACGGCCACC AGGGCTTCGG GGTGGCTGGG ATCAGATTGG TTGCG TAG TG * 8 8 0 

GGTTCTGCAG CGCTGCCAGG CCGCTGCGGC CAGGGTGGCG CCGATCGCGG "CACGAGGCG 2 94 0 

GGGGTGGGCG TCGCTGGTGA CGAGCGCGAC CCCGGACAGG CCGCGGGCGA CCAGGTCGCG 3 0 CO 

GAAGAACGCC AGCCAGCCGG CCCCGTCCTC GGCGGAGGTG ACCTGGATGC CCAGGATC 3058 

(2) INFORMATION FOR SEQ ID NO: 102:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 391 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102:

Met Val Asp Phe Gly Ala Leu Pre Pre Gl 
-> 1 C 

Tyr Ala Gly Pre Glv Ser Al^ .>r 



I^e Acn Ser Ala Ar? Met 

1 5 



al Ala Ala Ala Gin Mec 
30 



Asp Ser Val Ala Ser A^ 



e.; .^r A^a A^a Ser Ala Phe 'Sir. Ser 

35 ,c 

Val Trr: Gly Leu Thr V a : G'. ; S-r Trp : : e Glv Ser Ser Ala Glv 

50 55 *c 






Aso Thi 



Ua ^ a Ala A3n G — Met Asn Asn Vai Pro Gl 



195 



200 



n rt^a Leu 



205 



Leu Ala Gin Pro Thr Gin Gly Thr Thr P 



215 



ro Ser Ser Lys Lei 



22 0 



Gly Gly Leu Trp Lys Thr Val Ser Pro His Arg Ser Pro lie Ser 



23C 



235 



Asn 
240 



Met Val ser Met Ala Asn Asn His Met Ser Met Thr As 



245 



3er Met: Thr Asn Thr Leu 



n Ser Glv Val 



250 



260 



^a Ala Ala Gl;i Aid Val 



: Ser Ser Met Leu Lys Gly Phe Al 



255 



a Pro Ala 



?65 



il- Thr Ala Ala Glr 

13 0 



210 



Asn Gly Val Arg Ale 



.•let Ser Ser Leu Gly 

190 



v Ser Ser Leu Gly Ser Ser Gly Lev: Glv 



:?5 



■•a. «.:a Kia Asn Leu Gl v 
3 0 5 3 10 

Pro SI- A^a Trp Ala Aid 
325 



A^a Ser 



3 CO 

Gly Se: Lt-i; = 



3 15 



^e: V a ^ 

320 



ll1 A2n '''al Thr Pro Ala Ala Arg 

330 



A^a Leu Pro Leu Thr Ser Leu Thr Ser 
3-0 " - * 



Ala Ala Glu Arg Gly Pro Gl;, 
35C 

3~r. Met Ala Ara aIj 01'. 



J r- Tvr 



J * ■.■ .!*»r Pre 

'35 



5 9C 



;ANLGEDKHSH 01; 
TOPQLSGY linear 



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CGTGTTCGGG 7C3ATTTGGC CGGACCAGTL' ™=AC=AAC3 CTTCCCGTGC GCGCCAGGCG 
3CCGATCAGA TCGCTTGACT ACCAATCAA7 CTTGAGCTCC CGGGCCGATG ~CGGGCTAA 
ATGAGCAGGA GCACGGGTGT CTTTCACTGC GCAACCGGAG ATGTTGGCGG CCGCGGCTGG 
CGAACTTCGT TCCCTGGGGC CAACGCTGAA CGCTAGCAAT GCCGCCGCAC CCGTCCCGAC 
GACTGCGGTG GTGCCCCCCG CTGCJGAC3A G\JTG rCGCTG CTGCTTGCCA CACAA7TCCG 
TACGCATGCG GCGACGTATG AGACGGCCAG CGCCAAGGCC GCGGTGATC G ATGAGCAGTT 

"""^^ - ^^ToCSGAC ACCGAGGCCG CCAACGCTCT 
^C^^<^ -A^.^LCC GACGGTATTC GAG CGGAAGG ATTATCGAAG TGGTGGATTT 

-* -"-^urt^o^rti^ Gl^gogcjgg .rrTCGGCCT"'" 

~. ^.oo^^ Ji ^ A ^ww\CAG CG7GGCGAG7 GACC7G7777 JGGCCGCG7C 

v_-vj - jvj^vj^ Co 7GGA7AGG77 CG7CGGCGGG 
7CTGA7GGCG GCGGCGGCC7 CGCCG7A7G7 CGCC7<GGA^ AGCG^ A--— — 



G C"-CC^GG 7CCGGG_GC 7GCGGCGGCC TACGAGACAG CG7A7AGGCI 

_o_^7GA 7CGCCGAGAA :CG7ACCGAA CTGATGACGG ~GACCGCGAC 

^v-^.-^u-w-i^.-. ^....n^Cjrt: 2GAGGC2AA7 CAGGCCGCA7 AGAGGGAGAT 

J7G GGGCCAA GACGCGGAGG "GA7G7A*~~G — — — ~~ , r ^„„^_ _ 

-^1^^^^^ jvj3C3 AACCAG77GA 7GAACAATG7 



— ^o^G -7GCAACAGC Z3GZZ2Z 



24 C 

3C0 

360 

420 

48C 

540 

600 



900 
36C 



: J 3 J 



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(A; LENGTH : 3 5 9 ammo acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104:

Val Val Asp Phe Gly Ala Leu Pro Pro Glu Ile Asn Ser Ala Arg Met
1 5 10 15

Tyr Ala Gly Pro Gly Ser Ala Ser Leu Val Ala Ala Ala Lys Met Trp
20 25 30

Asp Ser Val Ala Ser Asp Leu Phe Ser Ala Ala Ser Ala Phe Gln Ser
35 40 45

-a^ ^1. Thi Val Gly be: :L-<p Glv Ser Ser Ala Slv 



- Ala Ala A^a Ala Ser Pre 



r*yr Val Ala Trp Me^ Ser Val rhr 



-y Gm Ala Sin Leu Thr Ala Ala Gin Val Arg Val A;a Ala Ala 

3 5 Qn 



2 ro Pre 



Asn Ar^ 



rnr Ala Thr Asn Lc 



; 4 o 



■Me t. 



\sd Ala 



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Ser Mer Thr Asn 
Ala Ala Gin Aid 

Ser Scr Leu Gly 

290 

Gly Ala Gly Val 
305 

Leu Ser Val Pro 



.Via Ala Arg Ala 



Thr Leu His Ser 



Val Glu Thr Ala 
230 

Ser Gin Leu Gly 
295 

Ala Ala Asn Leu 

310 

Pro Ala Tr;; Ala 
325 

Leu Pro Leu Thr 



Met Leu Lys Gly 
265 

A_La Glu Aon Glv 



Ser Ser Leu Gly 

300 

Gly Arg Ala Ala 
31b 

Ala Ala Asn Gin 

330 

Ser Leu Thr Ser 
34 5 



Leu Ala Pro Ala 
270 

Val Trp Ala Met. 
285 

Ser Ser Gly Leu 

Ser Val Gly Ser 
320 

Ala Val Thr Pro 
3 3 5 

Ala Ala Gin Thr 

iSO 
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77CC3 G7GGG GCGAAAGACG AACCACG7GC GTGGCGGC7C AAAC7G A CCG TGACCGAAGA 780 

GGGGGGACAG TACAAGATGT CGAAAGTTGA GTTCGTACCG TGACCGA7GA C3TACGCGAC 3 4 C 

GTCAACACCG AAACCACTGA CGCCACCGAA GTCGCTGAGA TCGACTCAGC CGCAGGCGAA 900 

GCCGGTGATT CGGCGACCGA GGCATTTGAC ACCGACTCTG CAACGGAATC TACCGCGCAC 96C 

AAGGGTCAGC GGCACCGTGA CCTGTGGCGA ATGCAGGTTA CCTTGAAACC CGTTCCGGTG 10 20 

A7TC7CATCC TGCTCATGTT GATCTCTGGG GGCGCGACGG GATGGCTATA CGTTGAGCAA 10 80 

7AC3ACGCGA TCAGCAGACG GACTCCGGCG CGGCCCGTGC TGCCGTCGCC GCCCCCTCTC 114 0 

ACGGGACAA7 GGCGCTGTTG 7G7A77CACG GGACACG7CG AC GAAG ACT 7 CGC7ACCGCC 12 OG 

GTGGGGGGGG 7 AAA C A GAAG TC AC7GAAA.A 37ACCGCCAA 337GGT3CGC G7GGCGG7G7 1 3 2 Z 

GGGAGC7ACA 7GGGGA77CG GCCGTC37TC 7GG77T77G7 CGACCAGAGC ACTACCAGTA 1 3 B 0 

AGGACAGCGC CAATCCGTCG ATGGCGGCCA GGAGCG7GA7 GG7GACGC7A GCCAAGGTCG 144 C 

AC GG CAATTG GC7GATCACC AAG7TCACCC GGG7TTAGG7 7GCCG7AGGC GG7CGC GAAG 150C 

TCTGAC3GGG 3CGCGGGTGG C7GC7CG7GC GAGA7ACCGG CCGTTCT™ GACAA7GACG 15o0 

GCCCGACC7G AAACAGA7C7 CGGCCG "737 G7AA7CGGCC G G G 7 7 A 77 7 A AGA77AG77G — 

. ■ .n.-.w . j'«7 -^TAGTTACG JCCGAGCGGA AGGA77A7 3G AA37G37GGA 3 0 C 

: "7- : J j JGJG 7:AGGAC7GG ACATCAAC7G 3 3CGAGGATG 7ACGCGGG3G G3GG77CG3C ; 3 r> 3 

' - J - T " j:j j "GG'J'JGGGA AGA7G7GGCA 3AG ~G~GG 33 -V37GAGC73- ~~:~3337G~ : 

~ 1 ' -"^ w ^- * - r> - j ^ - - -.j - - _ , „■ — . — — ^ 0 ^ n ^ 2 1 . _ 



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ICCCAA GCG^TGCAAC AACTGGC2CA 3CCCACGAAA AGCATCTGGC C3TT2GAC2A 2460 



520 



ACTGAGTGAA 2TCTGGAAAG 2CATCTCGC2 GCATCTGTCG CCGCTCAGCA ACATC3T3T2 

GATGCTCAAC AACCACGTGT CGATGACCAA CTCGGGTCTG TCGATGGCCA GCACCTTGCA 2 58 0 

CTCAATGTTG AAGGGCTTTG CTCCGGCGGC GGCTCAGCCC GTCGAAACCG CGG CGCAAAA 264 0 

CGGGGTCCAG GCGATGAGCT CGCTGGGCAG CCAGCTGGGT TCGTCGCTGG GTTCTTCGCG 2 700 

TCTGGGCGCT GGGGTGGC 23 GCAACTTGGG TCGGGCGGCC TCGGTCGGTT CGTTGTCGGT 2 76 0 

TGGGCCGCGG 7CAAC2AGG2 2GTCACC2CG GCGGC^CGOG 2GCTGCCGCT G82C 

GACCAGCCTG ACCAGCGG2G 2GCAAACCG2 72C2GGACAC ATGCTGGGCG GGCTACGGCT 2380 

GGGGGAACTG AC2AATAG23 -GGCGGG~~ 7GG23GGGTT AGCAAT3CGT 23C33AT3CC 2340 

GCCGCGCCCG TACGTAATGG 2C2GTGTGC7 73CC3CCGGG TAAC3CCGAT G C G CA CG CAA 

^ j^GGGGCGT CTATG2CG0 2 AG 23 AT 2 

(2) INFORMATION FOR SEQ ID NO: 106:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3 36 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear



3000 
3 227 



-'a. Asp A ho 



A s n G e : \ 2 a A : ~ e *_ 



' T A ^ a — ~ -'I;-' Ger A.j -'.jr Lei: ".'a. Ala Ala Ala Lvs Men Tru 

~- =1 3 2 



V!"*: 7-1 V ■ . ;„ ; - • - ^ .. - 



' i . Ar': '.' i . A.j A. a 



3: 



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Gin Asn Thr Pro Ala He Ala Val Asn Glu Ala Glu Tvr Glv Glu Met 
130 13b 140 

Trp Ala Gln Asp Ala Ala Ala Met Phe Gly Tyr Ala Ala Thr Ala Ala
145 150 155 160

Thr Ala Thr Glu Ala Leu Leu Pro Phe Glu Asp Ala Pro Leu Ile Thr
165 170 175

Asn Pro Gly Gly Leu Leu Glu Gln Ala Val Ala Val Glu Glu Ala Ile
180 185 190

Asp Thr Ala Ala Ala Asn Glu Leu Met Asn Asn Val Pro Gln Ala Leu
195 200 205

-In ulr. ^eu A^a G^n Pro Thi Lys Ser Tie Trr- Pro Phe Asn J 1 r. _eu 
:1 ° -15 220 



Ser Glu Leu Tro Lv: 



rt^a _e Ser Pro H:s Leu Ser Pre Leu der Asn 
23: -i. ~ " 



Val Ser Met Leu Asn Asn His Val Ser Met Thr Asn Ser Gly Val 
245 250 255 



Ser Met Ala Se 



Thr Leu His 3er Met Leu Lys Gly Phe Ala ?r.r, Ala 
260 265 270 



."a- G-u Thr Ala Ala Gin Asn Gly Val Gin Ala Met 

: 3 c ^ p q 



•: Ser Leu J 1 ■ 



100 



rtxa Asn Leu G 



wd ^ia jer .a^ je: 



'/a : P r 



A*a '.'a. T*hr ^r: 



i ^eu Ar 
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'i'. SEQUENCE ZliARACTERZ STZ Z3 . 

(A) LENGTH: 1516 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107: 

GATCGGAGGG AGTGATCACC ATGOTGTGGO ACGCAATGCG ACCGGAGTAA ATACCGCACG 

GCTGATGGCG ZGCGC3GGTZ GGG CTOCAAT GCTTGCGGCG GCGGCGGGAT GGCAGACGCT 

TTCGGCGGCT ZTQGACGCTC AGGCGOTGGA GTTGACCGCG GGC CTGAACT GTCTGGGAGA 

AGGGTGGAC7 GGAGGTGG GA GCGACAAGCG GCTTGGGGCT GCAACGCCGA TGGTGGTCTG 

1 Z T A G .AAA C C OGGTGAACAG AGGrGAAGAr ""^GTG GGATG GAGGGGAGGG :GC/.\0CC^C ; : : 

G G AT A GAG G 2AGGCGATGG OGAGGAGGGG GTGGGTGGGG GAGATGGGGG CCAACCACAT 3 60 

:.;:::agg:" gtggttaggg ggaggaagtt gttgggtatc aacacgatco ggatogcgtt 42: 

540 

500 

GTTGGGGAGT TGG GGGGGGG GG G T A G G G AG ACGCTGGGCG AAGTGGGTGA 660 



TGATOOGGGG GCGAGCGAGA G 




tO 
120 

:ac 

240 
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AATATTGGTG AGGCCGGCGT 2CAATACTCG AGGGCCGACG AGGAGCAGCA GCAGGCGCTG 15 CO 

TGCTCGCAAA TGGGCTTCTG AGCGGGTAAT AC GAAAAGAA AC GGAG CAAA AACATGACAG 15 6 0 

AGCAGCAGTG GAATTTCGCG GGTATCGAGG CGGCGGCAAG CGCAATCGAG GGAAAT 1616 

(2) INFORMATION FOR SEQ ID NO: 108:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 432 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108:

ETAGTGGATG GGACCATGGC 2ATTTT2TG2 AGTCTCACTG CETTCTGTGT 7 G A C A TT*~T G ~ n 



G2ACGC2GGC GGAAACGAAG CACTGGGGTC GAAGAACGGC TGCGCTGCCA TATCGTCCGG ; 2 0 

AG 2TTE2A2A CCTTGGTGGG GE2GGAAGAG CTTGTCGTAG TCGGGGGGGA TGA2AACCTC 1 ° C 

T 2AGAGTO 2G CTCAAACGTA TAAACACGAG AAAGGGCGAG ACCGACGGAA GGT2GAACTC 24: 

SCG C GATE 2 2 GTGTTTCG2T ATT2TACGCG AA2T2GGCGT TGCECTATGG GAA CATC CCA .3 C C 



".".HMATi„:; rOH ee2 -2 :;e : : : •< 

SEQUENCE GHAkACTERlJTIGS : 
A- LENGTH: iori arr.ir.c "icia^ 
3 TYPE: arm: ic:a 

GTP.ANG EE NEE J /; . ■/ 
1 : :p':l--jv . . --a: 



4 : 
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Leu Ala Ala Ala Thr Pro Met Val Val Trp Leu Gin Thr Ala 3er Thr 
65 70 75 80 

Glii Ala Lys Thr Arg Ala Met: Gin Ala Thr Ala Gin Ala Ala Ala Tyr 

85 90 95

Thr Gln Ala Met Ala Thr Thr Pro Ser Leu Pro Glu Ile Ala Ala Asn
100 105 110

His Ile Thr Gln Ala Val Leu Thr Ala Thr Asn Phe Phe Gly Ile Asn
115 120 125

Thr Ile Pro Ile Ala Leu Thr Glu Met Asp Tyr Phe Ile Arg Met Trp
130 135 140

Asn Gln Ala Ala Leu Ala Met Glu Val Tyr Gln Ala Glu Thr Ala Val

Gly Ala Ser Gln Ser Thr Thr Asn Pro Ile Phe Gly Met Pro Ser Pro
180 185 190

Gly Ser Ser Thr Pro Val Gly Gln Leu Pro Pro Ala Ala Thr Gln Thr
195 200 205

Leu Gly Gln Leu Gly Glu Met Ser Gly Pro Met Gln Gln Leu Thr Gln
210 215 220






^55 .160 365 



INFORMATION FOR SEQ ID NO: 110:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 100 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110:

Met Ala Glu Met Lys Thr Asp Ala Ala Thr Leu Ala Gln Glu Ala Gly
1 5 10 15

Asn ?he Jlu Arg lie Ser Sly Asp Le>a Lys Thr J^n nsp j^r. .a: 



^er Thr Ala Sly 3er Leu J In Gly Gin Tm Arg Giv .Ala A- a 3iv 
3d ^ g 4 b 

Ala A _a Gin Ala Ala Val Val Arg Phe Sir. Glu Ala Ala A sr. Lys 

5 C 5 5 6 0 



,a. ,j^r: y/r h>er Arg A^a Aap 3 1 u G 1.: Glr: Gl" Gl:i Ala Leu Se: Ser 
35 9C ?5 



— *. ^ v'ii rU:< jc w .«L : .11 

IQUENCS CHARACTERISTICS. 
A LENGTH : 3 96 case pa. 
? T":'?E . :u::e:; .k: 



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LGACGAGG GGAAGCAGTC CCTGACCAAG GTCGCA 3 96 

INFORMATION FOR SEQ ID NO: 112:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 80 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112:

Ile Ser Gly Asp Leu Lys Thr Gln Ile Asp Gln Val Glu Ser Thr Ala
1 5 10 15

Gly Ser Leu Gln Gly Gln Trp Arg Gly Ala Ala Gly Thr Ala Ala Gln
20 25 30

Ala Ala Val Val Arg Phe Gln Gln Ala Ala Asn Lys Gln Lys Gln Glu
35 40 45

Leu Asp Glu Ile Ser Thr Asn Ile Arg Gln Ala Gly Val Gln Tyr Ser
50 55 60

Arg Ala Asp Glu Glu Gln Gln Gln Ala Leu Ser Ser Gln Met Gly Phe
65 70 75 80



::j format :gn for gel. :l !:c:::j: 

A LENGTH- 3 3" case :a:r 

3 "VPF. ::i;:>:- ic::: 

'? 7TRANDEDNES5 sing.- 

. L TOPOLOGY i near 

SHCVENCF GEG3?IP'TON: ^E2 



aCCG 



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i 1 SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 2^2 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114:

CCCCAACGGC GCTGGCGAGG GCTCCGTTCC GGGGGCGAGC 60 

TGCCCCCAGC CGCGCCTGGA TGGATGGACC AGTTGCTACC 12 0 

CTCTGTCCGA TAG CGGTG AC CGGGGCGGGC ACGTCGGGAG 13 0 

-^■^^GvjTTCG GCZGGGGACG GAGACGGTCT GGACGGAACG 2 4 0 

INFORMATION FOR SEQ ID NO: 115:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear



CGGCACGAGG ATCTCGGTTG 
TGCGCGCCGG ATGCTTCCTC 

^ j * TGGGGGG - — A.GG C G GGG T 
-"—*>— GGGG*jTT LGCGGATTGG 



Val Ala Ala Leu

20 







(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 117:

Ala Ala Met Lys Pro Arg Thr Gly Asp Gly Pro Leu Glu Ala Ala Lys
1 5 10 15

Glu Gly Arg



INFORMATION FOR SEQ ID NO: 118: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 
(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118:



?hf* Asp Pro Ala 

1 0 



INFORMATION FOR SEQ ID NO: 119:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 14 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119:



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xi SEQUENCE DESCRIPTION: SEQ ID NO : 1 2 I : 

Asp Pro Giu Pro Ala Pre; Pro Val Pro Thr Thr Ala Ala Ser Pro Pro 

5 :o 15 



INFORMATION FOR SEQ ID NO: 122:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 15 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122:



jeu j^-.' Asn 



INFORMATION FOR SEQ ID NO: 123:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123:



■o-.n ^ei 



.ier Phe .ila Asr 



■CP>lAri2N rOR SED. ID NO.. 

;-:;-.;enc:; jmapa . • 

A L^ENGT:: . .i™ in:, 

""TRAN^EDNES.- 



: d Nr 



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1' SEQUENCE CHARACTERISTICS : 
(A) LENGTH: 7 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 125:

Asp Pro Gly Tyr Thr Pro Gly
1 5

INFORMATION FOR SEQ ID NO: 126:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(ix) FEATURE:

( D 1 OTHER INFORMATION : ,'::oc«^ "Th^ Second Residue San 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126:

Xaa Xaa Gly Phe Thr Gly Pro Gln Phe Tyr

1 5 10

INFORMATION FOR SEQ IS NC:12"V 

SEC UENCE OHARACTE RISTICO . 
A LENGTH, arr.mc ae^ns 
3' TYPE, ammo acid 



LATURS : 



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Xaa Xaa Xaa Clu Lys Pro ?he Leu Arg 

INFORMATION FOR SEQ ID NO: 129:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 15 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 129:

Xaa Asp Ser Glu Lys Ser Ala Thr Ile Lys Val Thr Asp Ala Ser
1 5 10 15

INFORMATION FOR SEQ ID NO: 130:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 130:

Ala Gly Asp Thr Xaa Ile Tyr Ile Val Gly Asn Leu Thr Ala Asp



IN FORMATION FOR 3E 



2QUENCE :haracteristiss : 

A LENGTH : IE amino i c i a :: 
3 TYPE: amine acid 
S STRANDEDNESS 



IY?E amine _ici: 
STRANDEDNESS 






Asn Val I4is Leu 7a 1 

20 

(2) INFORMATION FOR SEQ ID NO: 133:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 382 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 133:

GCAA.CGCCGT ^CTGGv — TTT .j'—oo ivjATCO vjCTT CG C^_ ^ C jCiouCou.'j j^, jui^'j^jj 

T SAC CATC I j .-\CC3 J rtC _ ^ v_G o-_~T. — -imAj-vG _GCT.*aGAG^_jG .-^ \CCAAn-r\C j^CC-u Aj 12'„ 

3GAAGTTCAT SCCGTTGTTw . — ^jA^.Gw\AC AG CAGG C G C C GGTCCCGCCG > — TC_G_' J 

GGCCGGCTAC CTCAC^CGGvj j .GGu T^jum GGLCGGCCTC G : _CCGCGCCG GAAlj.^G'- jG 30C 

GCGTGCCCG ; -7 iGTTG^GC. a,C^^ iG„ _^A * GCCGGT ■' '■_ AT CA ^ I ^ - „^^.^_^Gi 3oC 

TIZIwvjoTTj aCAGi — .. o u/A ."-v I -j — A>_ ■ — >\ . — LLAv. _ j ._ A — -j C ^^nC « | ..uv....jj j jA 420 
j . „ jAL^rt^'ji — - — — j ._ _ C jAC ... AC i.. ..j'jTGrtv... A^G^—o-^Arv 480 

, _ ; - -\ /vrt'w jA( . . 3 . ; . ■ m i . . jrv_ i ..... j 'J \i '. 

] J J I j — CI j A — A\~ . ^ /-vG....A G CCA^.GCC GACGACC3TG 3CTC_GwA0'_ 'jtj', 

-j ~ . Vo w\ -J^ rtL. ■ J w\./-i MM t ... . MMi.riurt . .~L_rt. „ w ^ _rt O ^. o i ^- : : J - - 



RMAT 



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.x:; SEQUENCE DESCRIPTION: GE^ 13 NC:134. 

CCAT7AACCA ACCGCTCGCG CCGCCCGCCC CGCCGGATCC GCCGTGGGCG CCACGCCCGC 60 

C3CTGCCTCC GGTGCGCCCG TTGCGGGGGT GGCGGCGGTG GGGGCGGACG GGGTGGGTGC 12 0 

CTAGG GCGCT GTTACCGCCC TGGTTGGCGG GGACGCCGCC GGCACCACCG GTACCGCCGA 180 

TGGCGCCGTT ' GCCGCCGGCG GCACCGTTGC CACCGTTGCC ACCGTTGCCA CCGTPGCCGA 240 

CCAGCCACCC GCCGCGACCA CCGGCACCGC CGGCGCCGCC CGCACCGCGG GCGTGGCCGT 300 



TCGTGCCCGT ACCGCGGGCA GCGCCGTTGG CGCCGTCACC GCCGACGGAA CTACCGGCGG 
ACG7GG77TG 777GGGGGCG 7GGGCGGCAC CGCCATTGGC ACCGCCGTCA GCGCCGGCTG 42 C 

GGAGTGC7GC GATTAGGGCA 7~GAC7GGC3 AAACGAGCGC AAGTACTGTG GGTCACCG AC; 

AGGGGGTAGC TG7GGGGTGC GGTAAG777 GATCATGA7G 7CGAGGTGAC CGTGACCGCG 
CCG GCGGAAG GAGGCGGTGA ACTGGGCGT7 GAGCCGA7CG GGGATCGGTT GGGGCAGZGC 
GCAGGCGAAT ACGGGGA7AC 7GGGTG7G!;A AGCCGCGGCG AGCG7AGCTT GGGTTGCGGG 

ag:;gtggtgg gggtgg:etg ttacgcegtt 77c:— ggaac acgagtagca 

GGwGAGGGCA TCGACCACGC GTTGCG7CA3 G7CG7 

(2) INFORMATION FOR SEQ ID NO: 135:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1152 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 135:



360 



4 30 
540 
600 
660 
^20 
^80 
315 



GGGCTAACAG : 3C 






ACACCCGACG TGTCATACGC 3CCGCGOCT^ CGTCAGCAAG TTCACCGCAC CGACGATCCT 480

gcgttgtgcc tgtcgttaag eaaccggatc gtgtcgagga agatcctgaa tcagcaggcc 540

ttgattcggg cacacacgtc ggggcaagac gttgctgaga gcatccgcac gatgaagcac 600

tcgctggcct gggtcgatcg atcgggctcc ctggcggagt tgaacgggtt cgagggaaat 660

gccgcaaagg catacttcac cgcgctgggg catctcgtcc cgcaggagtt cgcattccag 720

ggccgctcga ctcggccgcc gttggacgcc ttcaactcga tggtcagcgt cggctattcg 780

ctgctgtaca agaacatcat aggggcgatc gagcgtcaca gcctgaacgc gtatatcggt 840

ttcctacacc aggattcacg agggcacgca acgtctcgtg ccgaattcgg cacgagctcc 900

gctgaaaccg gtggccggct jctcagtgcc tgtacgtaat zcsct3cgqc caggccggcc 960

cgcc3gccga ataccagcag atcggacagc gaattggcgc ccagccggtt ggagccgtgc 1020

gcgtctagtt cgacacegcc catcacgtag tcacacgtcg gcccgacttc cattgcctgc 1140
gttcggcacg ag 1152
(2) INFORMATION FOR SEQ ID NO: 136:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 136:




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TCCCGACGCT GGTCGCGGTT GCGCCGCGAA AGCGGCGGGT CGGGTGCCAT CAGGAATGCC 54 0 

TCACGGCCGC GGCACTGCAC GGCCAGTGCC GCGGCGATGT CAGCCATCGG GACATGATGG 600

TCGCGTTCAT ACTCCTCGAC CACTCGGCGC AACAGCTCGA TTCCGGGACC GCCCA 655
(2) INFORMATION FOR SEQ ID NO: 137:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 267 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137:

Asn Ala Val Val Ala "he Ala '/a I Gly ?he Ala 5er Lev: A^a Val 



a Val Ala Val Thr TV- Aru Pro Tar Ala Ala 3er Lys Pre Val Glu 

20 25 30 

ily His Gin As:: Ala 31:: Pro Gly Lys Phe Met: Pro Leu Leu Pro Thr 

35 40 45 

;ir. Gl:: Gin Ala Pro Val Pro Pro Pro Pro Pro Asp Asp Pro Thr Ala 
5 0 5 5 5 0 

^h- 3' r. 31".' 31 v Thr lie Pro Ala Val 3.:: Asn Val Val Pro Aro 



Va. Va^ Pro ^_a : J r: /a^ .- j ro 
105 110 



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195 230 205 

Ala Thr Ala Thr Pro Thr Thr Val Ala Pro Gln Pro Thr Gln Gln Pro
210 215 220

Thr Gln Gln Pro Thr Gln Gln Met Pro Thr Gln Gln Gln Thr Val Ala
225 230 235 240

Pro Gln Thr Val Ala Pro Ala Pro Gln Pro Pro Ser Gly Gly Arg Asn
245 250 255

Gly Ser Gly Gly Gly Asp Leu Phe Gly Gly Phe
260 265

(2) INFORMATION FOR SEQ ID NO: 138:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 174 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 138:

lie Asn Sir, Pro Leu Ala ?r': Pro Ala Pre Pro Asp Pre; Pro Se; Pre 

Pro Ai'3 -ro Pio Val Pro ?u Val Pro Pro Leu Pro Pro Ser Pro Pro 

zz s so 

3e: ?: ; Pr.o I::; G:y Pro Val Pro A: o Ala Leu Leu Pro Pro Trp Leu 

: = -v 1 -4 5 

A^a Gly Thr Pro Pro A^a - r o Pro Va! Pro Pro Meo A. a Pro Leu Pro 
3 0 3 5 ■", C 

-ro Ala A 1 a P r o Leo Pro Pr~ Leo Po: Pr z Leu Pro Pro >'j ?r~ Thr 



A1.J Gv:.: P r j Pne Val Pro Val. ?r: Pr~ A. a Pro Pro Leu Pro Pro Ser 



Pro A^a Aoo A.j A. i Ovs Pro Pro A * a Pro ?r- 



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Leu Pro Asp Asp Thr Thr Ala Arg Gly Cys Arg Arg Thr Gly 

165 170 

(2) INFORMATION FOR SEQ ID NO: 139:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 35 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 139:
Sir. Pre Pro Ala 31 -j Val Ser Asn 31- '/a*. hp»- "lv '»n ' 



Ala val Gin Pro 3er Pro Arg Thr Thr Ala Glu Asp Pro Arg Pro A 
20 25 30 

Asr. Arg Arg 
35 

(2) INFORMATION FOR SEQ ID NO: 140:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 104 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 140:



Asc Ser J.- 



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Gly Gin Leu Arg Arg Gin Phe Tyr 

100 

(2) INFORMATION FOR SEQ ID NO: 141:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 53 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "PCR primer"

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mycobacterium tuberculosis

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 141:

GGATCCATAT GGGCCATCAT CACCATCATC ACGTGATCGA CATCATCOGG AC 

(2) INFORMATION FOR SEQ ID NO: 142:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 42 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "PCR primer"

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mycobacterium tuberculosis

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 142:



JSJ. _vJ^ 



(C) STRANDEDNESS: single
(D) TOPOLOGY: linear






'GCA AAAC CACOGAGCGG T 

(2) INFORMATION FOR SEQ ID NO: 144:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "PCR primer"

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mycobacterium tuberculosis

"TrioAATTv. . vjGAA ATCGTCGCGA T 

(2) INFORMATION FOR SEQ ID NO: 145:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "PCR primer"

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mycobacterium tuberculosis



. ' — awCC ^ ^ uAtjnl^AA 3ACCGATGCC SOT 



(D) TOPOLOGY: linear



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9S 



(2) INFORMATION FOR SEQ ID NO: 147:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1993 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mycobacterium tuberculosis

(ix) FEATURE:

(A) NAME/KEY: CDS
'_, ,> ^3' T,T nM- - c -> 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 147:

CGGCAGGCTG GTGGAGGAAG GGCCCACCGA ACAGCTGTTC TCCTCGCC 

A3CAT3C3GA AACC3CCCGA TACGTCGCCG GACTGTCGGG GGACGTCAAG GACGCCW 

3CGG AAATTG AAGAGCACAG AAAGGTATGG " 3TG .AAA ATT CGT TTG CAT ACG 

Va I Lys lie Ara Leu His Thr 



an 



-a-, .c; Thr Aia 



Leu Le 



G OTA 
u Leu 



Ala Ala 






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CTG ATG AAC ATC GCG CTA GCC ATC TCC GCT GAG 'GAG GTG AAC TAC AAC 5 56 

Leu Met Asn Ile Ala Leu Ala Ile Ser Ala Gln Gln Val Asn Tyr Asn
120 125 130 135



GTG GCC GGA GTG AGG GAG GAC GTC AAG CTG AAG GGA AAA GTG CTG GGG 
Leu Pro Gly Val Ser Glu His Leu Lys Leu Asn Gly Lys Val Leu Ala
140 145 150



Ser Lys Gin Asp Pro Glu Gl 
20C 205 



hr Val Asp Phe Pre Aid Val 



j.- r Men , a_ Thr !iv 2vn Ala G-u * 



:vs Val Ala 



604 



GCG ATG TAC GAG GGG ACC ATC AAA ACC TGG GAC GAC CCG CAG ATC GCT 652 

Ala Met Tyr Gln Gly Thr Ile Lys Thr Trp Asp Asp Pro Gln Ile Ala

155 160 165

JGG GTC AAC GCC GGC GTG AAC CTG CCC GGC ACC GGG GTA GTT CCG CT3 700

Ala Leu Asn Pro Gly Val Asn Leu Pro Gly Thr Ala Val Val Pro Leu

170 175 180

Arg Ser Asp Gly Ser Gly Asp Thr Phe Leu Phe Thr Gln Tyr Leu
185 190 195



Ljtj^ /^^r TTC GGC ACC 79£ 

Gly Lys Ser Pro Gly Phe Gly Thr



. 0^.-0 CTG GGT GAG AAC GGC AAC 844
Gly Ala Leu Gly Glu Asn Gly Asn
225 230



jiu - - UCC GAG ACA ICG GGC ^GC GTG GCC TAT 8 92 



3G2 



9AR 




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330 33:: 



340 



AAC AAG GCC TCG TTC CTC GAC CAG GTT CAT TTC CAG CCG CTG CCG CCC 
Asn Lys Ala Ser Phe Leu Asp Gln Val His Phe Gln Pro Leu Pro Pro
345 350 355

GCG GTG GTG AAG TTG TCT GAC GCG TTG ATC GCG ACG ATT TCC AGC
Ala Val Val Lys Leu Ser Asp Ala Leu Ile Ala Thr Ile Ser Ser
360 365 370 



1228 



1273 



1333 
1393 



163 3 
1693 
1-753 



TAGCCTCGTT GACCACCACG CGACAGCAAC CTCCGTCGGG CCATCGGGCT CCTTTCCGGA 

GCATGCTGGC CCGTGCCGGT GAAGTCGGCC GCGCTGGCCC GGCCATCCGG TGGTTGGGTG 
GGATAGGTGC GG7GATCCCG CTGCTTGCGC TGGTCTTGGT GCTGGTGGTG CTCGTCATCG 14 5 3 

CAGGCAACAG GTACGGCGAA ACCGTTGTCA ^CGACGCGTG GCCCATCCCG TCGGCGCCTA 1 E " 2 

CTACGGGGCG TTGCCGCTGA TCGTGGGGAC GCTGGCGACC TCGGCAATCG C C CTG A T C A T 
CGCGGTGCCG GTCTCTGTAG GAGCGGCGCT GGTGATCGTG GAACGGCTGC CGAAACGGTT 
GGCCGAGGCT GTGGGAATAG TCGTGGAATT GCTCGCCGGA ATCCCCAGCG TGGTCGTCGG 
TTTGTGGGGG GCAATGACGT ZGGGGCZGTT CATC 3 CTG AT CACATCGCTC GGGTGATCGG 1313 

ITA^—GGGC GGCGACCCGG GCAACGGGGA 19^2 

'ACT GAT GAC 2TGTTCCCGC AGGTGCCGGT 2-TTGCCCEGG GAGGGCGCGA ^CGGGAACT" 

(2) INFORMATION FOR SEQ ID NO: 148:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 374 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 148:

Ar ~ Le ' J " 1J : '" r - e - -eu A - a Val Leu T::: A. a A J. a Pr., 



: ? 3 3 



'rr Prr ]. 



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1 >N 



50 



Phe Asn Leu Trp Gly Pro Ala Phe His Glu Arg Tyr Pro Asn Val Thr

Ile Thr Ala Gln Gly Thr Gly Ser Gly Ala Gly Ile Ala Gln Ala Ala
85 90 95

Ala Gly Thr Val Asn Ile Gly Ala Ser Asp Ala Tyr Leu Ser Glu Gly
100 105 110

Asp Met Ala Ala His Lys Gly Leu Met Asn Ile Ala Leu Ala Ile Ser

115 120 125

Ala Gln Gln Val Asn Tyr Asn Leu Pro Gly Val Ser Glu His Leu Lys
130 135 140



leu Asn Gly Lys 7a i Leu Ala Ala Met Tvr 
150 



:hr lie Lvs Thr 



Trp Asp Asp Pro Gln Ile Ala Ala Leu Asn Pro Gly Val Asn Leu Pro



17 G 



175 



^-y ...r Val Val Pro Leu His Arg Ser Asp Glv Ser Glv Asn Thr 

190 135 i 9 o * 



Phe Leu Phe 
19 5 



rr ^eu Ser Lys 



? r o G 1 u 
205 



y Phe Gl\ 



/al Asp Phe Pro Ala Val Pro Gl\ 




a:,-* As:: 



•A- a Ser Met ;i»j Aoo Glv 



Pro 



*y r A .a : \<i Val Asn 






His Phe Gln Pro Leu Pro Pro Ala Val Val Lys Leu Ser Asp Ala Leu
355 360 365

Ile Ala Thr Ile Ser Ser
370

(2) INFORMATION FOR SEQ ID NO: 149:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1993 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 149:



. i. G - Gvj ^ _ T A T A T G OG C A ~ 
j COO A/-. 7 T A J 3 G AAT AG GT ' 



AACCGCCCGA TACGTCGCGG GACTGTGGGG GGACGTCAAG GACGCCAAGC 12 0 

jCjG^T.G A^GAw-CACAG AAAGGTATGG CGTOAAAATT CGTTTG2ATA CGGTGTTGGG I3C 

CGTGTTGACC GCTGCGCCGC TGCTGCTAGC \GCGGCGGGC TGTGGCTCGA AACCACCGAG 24 0 

GGGTTCGCCC GAAACGGGCG 0C0GG0CC0 G TACTGTCGCG ACTACCCCCG 2GTGGTCGCC 3 00 

3GC0ACGTTG GCGGAGACCG GTAGGACGCT GCTCTACCCG CTGTTCAACC TGTGGGGTCC 3 6 C 

300CTTTCAC GAG AC G TAT G C0AAC3TCA2 GAT GACCGCT CAGGGCACCG GTTCTGGTGC 4 2C 

_ G-jGrtTCSCG J.-lG C C 0 3 0 G 3 :OGGGACGGT GAAOATTGGG 3CCTCCGACG GGTATGTGTG 4 8C 

^ ^r.- vj- vu w ^ ^ . ~ ~j ^- — w »_ w ^. ;\_-uvj juuv. . 3 AT 3AACATC 3C3CTAGCGA "TCTCOGGTCA : : 4 .J 

. A _ 1 _ — ■->- -LoGAGTGAG 3GAGCACCTC AAGCTGAACO 1AAAAGTCCT n C 0 

; j..:j-^Tj -AwOAGGGGA OCATOAAAAG 3TGGGACGAC GC3CAGACC0 GTGCGCTCAA -560 



. w r \ o JAM v_ J 



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100 



GACCTTGCAG GCATTTCTGC ACTGGGCGAT CACCGACGGC AACAAGGCCT GG7TCCTCGA 
GGAGGTTCAr nCCAGCCGC TGCCGCCCGC GGTGGTGAAG TTGTCTGACG CGTTGATCGC 
GACGATTTCC AGCTAGCCTC GTTGACCACC ACGC3ACAGC AACCTCCGTC GGGCCATCGG 
C-CTGCTTTGC GGAGCATGCT GGCCCGTGCC GGTGAAGTCG GCCGCGCTX^ CCCGGCCATC 
COGTGGTTGG GTGGGATAGG TGCGGTGATC CCGCTGCTTG CGCTGGTCTT GGTGCTGGTG 
GTGCTGGTCA TCGAGGCGAT GGGTGCGATC AGGCTCAACG GGTTGCATTT CTTCACCGCC 
ACCGAA7GGA ATCCAGGCAA GACCTACGGG GAAAC CGTTG TGACCGACGG GTGGCGCATC 
CGGTGGGCGC CT ACTA CGGG GCGTTGCCGC TGATC3TCGG GACGCTGGCG ACCTCGGCAA 



..... w.TCGGGGTG GGGG7GTCTG .Auu^Cjli^ JLTGGTGATC GTGGAACGGC 



TGCGGAAACG GTTGGCGGAG G C TG TG GGAA TAGTCGTC 



GGAATCCGCA 



GC073GTCGT CGGTTTGTGG GGGGCAATGA CGTTGGGGCC GTTCATCGC7 GATCACATGG 
CTCCGGTGAT CGCTCACAAC GCTCGCGATG 7GCG3G7GCT GAACTACTTG CGCGGCGACC 
C3GGCAACGG GGAGGGCATG 77GGTG7CC3 GTCTGGTGTT GGCGCTGA7G 0TCG7TCCCA 
TTATCGCCAC CACCACTCA7 GACGTGTTCT GGCAGGTGCG GGTGTTGCCC CGGGAGGGCG 

'JJATGSGGAA ttc 

(2) INFORMATION FOR SEQ ID NO: 150:

A. LENGTH: =i~inc 

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 150:

; *'.-'"* • -eu 



- - 3 C 

^ r - - A ^ -V ^ '-a. A i a ?r-. Ala ,er 



1200 

12S0 

1320 

1380 

1440 

1500 

156C 

1520 

158C 

; "4c 
laoo 
i a 6 c 

192 0 
198C 
1393 






Ile Thr Ala Gln Gly Thr Gly Ser Gly Ala Gly Ile Ala Gln Ala Ala
85 90 95

Ala Gly Thr Val Asn Ile Gly Ala Ser Asp Ala Tyr Leu Ser Glu Gly
100 105 110

Asp Met Ala Ala His Lys Gly Leu Met Asn Ile Ala Leu Ala Ile Ser
115 120 125

Ala Gln Gln Val Asn Tyr Asn Leu Pro Gly Val Ser Glu His Leu Lys
130 135 140

Leu Asn Gly Lys Val Leu Ala Ala Met Tyr Gln Gly Thr Ile Lys

145 150 155 160

Trp Asp Asp Pro Gln Ile Ala Ala Leu Asn Pro Gly Val Asn Leu Pro

165 170 175

Gly Thr Ala Val Val Pro Leu His Arg Ser Asn Gly Ser Gly Asn
180 185 190



"-95 2G0 



Asp Pre Glu Gly Trc G: 
205 



lys Ser Pro Glv ?he Gly Thr Thr Va ■ Asp Phe Pro Ala Val Pro G 1 

215 220 

^ a LSU Jiy 31u Asa 31 y Met: Val Thr Glv Tvs Ala Glu 

=30 235 - ' 



1 4 5 



a: ,wj Tyr _e Hv lie Ser Phe Leu Asp Sin Aid 

15 0 -> c r. 



Phe Lei, 



^ e r 

: bo 



.a j. d .a ^ d 




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3?0 

(2) INFORMATION FOR SEQ ID NO: 151:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1777 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 151:

GGTCTTGAGG ACCACCTGGG TGTCGAAGTC GGTGCCCGGA TTGAAGTCCA GGTACTGGTG 60

GGTGGGGCGG GCGAAACAAT AGCGACAAGC ATGCGAGCAG CCGCGGTAGC CjTTGACGGT 12 C 

3 TAG ~GAAAC GGCAACGCGG CCGCGITG3G CACCTTGTTC AGCGCTGATT T G G A C AA CA C ISO 

— ^ C3 3vjnAb jTGATj^v-Gi artAl'I G T jG CGv„ GC j.-\nC3 CTGGGGAC ^ j — >-»r\ * ~ ■ — * _4C 

ZTGCAACCGG GCAGCGGGCG TCGTCAACCG GCATCCCGTT CA C C G C G AC G G C1TG CCGGG 3 00 

G C C AA C G CA T ACCACTATTC GAACAACCGT TCTATA ITTT GTCAACGCTG GCCGCTACCG 3 6 

AGCGCCGCAC AGGATGTGAT ATGCCATCTG TGCC CGCACA GACAGGAGCC AGGCCTTATG 42 0 

A GAG ■ GAT T C G GCCjTCCAGCC CTACGGGCAG CCGAAGTACC TAGAAATCGC CGGGAAGCGC 48 0 

ATGG 2CTATA TCGACGAAGG CAACGGTGAC GC2ATC3TCT TTCAGCACGG C AA G C C CA C j d4Q 

_ GT 7TT^\ _ T . uTGu „3CArt _AT _i vToC- j CA2TTG3AAG GG GTGGGG C j_ ^GG iOvjC- nOO 

. JGG^-C — ->\ ~ _GGG--\T.jvj^j „GC - * CGGr\C AAGCT7AG2C GATGGGGACC j/iC^Ov — A- doc 



a!joU\ - jnu/v o .... '._ . . . j'ju , -v j. _j *jv. *. G - ^ o O — 



\v_ ^jTG G ^ A'_ . GGTG _ TG L /\ ^ vjA _ T Gvj'oGC TCGGCCCT7G GCTTCGACTG j-jCIAAJ A^ HO 



rr"';A-";GA^ 






jACGACG c g g igcggtgct 



3 . . iMMl JMi 



j ^ J/-*. iri^ - o^LaATT GAGA 



7 RAN iJ- EG N EGG 3 : 
"PTGGGV linear 



'JEN- 



1380 

1 4 4 G 

15 0C 
155C 



1680 
:^4G 



UACCAAGAAT GTGATTT GGG GCCAAGGCGG CGCGG7GGTZ GTCAACTCAT AAGACTTCCT 
GCTCCGGGCA GAGATTGTCA GGGAAAAGGG CACCAATCGC AGCCGCTTCG TTGGCAAGGA 
GGTCGACAAA TATACGTGGC AGGACAAAGG TCTTCCTATT TGCCCAGCGA ATTAGTCGCT 
GCCTTTCTAT GGGCTCAGTT GGAGGAAGCC GAGCGGATGA CGCGTATCCG ATTGGAGCTA 
TGGAACCGGT ATCATGAAAG CTTCGAATCA TTGGAACAGG GGGGGCTCCT GCGCCGTCGG 162 0 
ATGATCGGAG AGGGCTGCTC TCACAACSCC CACATGTACT ACGTGTTACT \GCGGCCAGC 
GGGGATCGGG AGGAGGTG CT GGCGCGTCTG ACGAGCGAAG GTATAGGCGG GGTCTTTCAT 
TACGTGCCGC TTCACGATTC GGCGGCGGGG GGTCGCT 

(2) INFORMATION FOR SEQ ID NO: 152:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 324 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 152:

GAG AT TG AA T GGTACCGGTC TCCTTAGCGG CTCC3TCCCG TGAATGCCCA CAT OA CG C AC 

^ GTTG TGGCTGTCGA 'ZG^ZGCGGG ATGCOCGGAC GTTGGTAAAC OCAGGGTTTG 

■w_AJ.AA.: IE jGGGGAGG ITTGCGGGAA JGCGGCCAGG ATGTGCGTGA "5CCG2GGCG" 



lAACGAOT ICI 






-CACOACTTC GAGCGG3AGT CGATGGGCGT GCTGACGCGT GCTGTCGCTA TGGCTGCGTG 240

3GAAGCTCGC GTTCGGAAGC GATTTCCGTT CCTCACTGAC GTCGACGCGG ACGAGCAGCG 300

GTGGGCGGCG TGCGAGGAAC GGCACCGCCG GGAAGTGGAG AACGCGCTGG ZGGTGCTGCG 360

GTCCTGATCA ACGTGCCGGC GATCGTGCCG TTCGGCTGGC ACGGTTGCGG CTGGACGCGG 420

CTGAATCGAC TAGATGAGAG CAGTTGGGCA CGAATCCGOC TGTGGTGGTG AGCAAGACAC 480

GAGTACTGTC ATCACTATTG GATGCACTGG ATGACCGGCC TGATTCAGCA GGACCAATGG 540

AACTGCCCGG GGCAAAACGT CTCGGAGATG ATCGGCGTCC CCTCGGAACG GTGCGGTGCT 600

GGCGTCATTG GGACATCGGT CC3GCTCGCG GGATCGTGGT GACGCCAGCG CTGAAGGAGT 660

ggaggggggg gg~~-- ----- "tgctggagg jgggggagag ggtgg^gc'G ~~7AAGGGCG 'u 

JGATGGGCGA GAAGGGCTTG 3AGG7G3CGG JGGACGAGTT GTTGTTG7TG GGGAGGGTG3 ~8C 

G3CACAGCCA CGCCGAGCGG GTTCGC2CGG AGCACCGCGA GCTGCTGGGG CGGGCGGCCG 34 0 

rCGACAGCAC CGACGAGTG7 GTGCTACTGG GGGZCGZAGZ GAAAGTTGTT GCCGCAC7GC 900 

G3GT7AAGG3 3GGAGAGGG7 CT3GAC3CGA 7 3 3AGGA7CT GGACATG7GG ACGGGGGAGT 960 

GGGCGA7G33 CG733G3GAG GC3G7GGGGG 7GGC3CG7AG GCCGGAGTA" GGCGGTTGCA 1080 

7GAGG73G37 3GAG373C3G 3T3ACGGG3A 33773G33GG GGGGG*"3GAG GATGAGGGCG :i40 



. : 3 :„ L 7-iAi'A .'T^r I J 7 ; 
^ _Z!: _;7:i • : ; :as^ ,1 ; 

g GT?A:;^L;rrii£GG s:--.e 






GZAGGGGCGG TGCCGGCGGC GC2GCCAAC0 ACGCCGGCAG CACGGGCAAT 2CGGGCGGTA 

AGGGCGGCGA CGGCGGGATG GGCGGTGCCG GGGGGGCCGG GGGGGCGGGC GGCACCGGCA 

AGGGCGGCGA TGCCGGCAAC C 

(2) INFORMATION FOR SEQ ID NO: 155:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 492 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 155:

2-AAGACG0GG 2GCGGGCA7A ""CGAT0G0C7 -TGGCGACTAG 

3CGGCG7CGG GCTGA7CA7C AC 




24C 

30G 

3 2; 



:GGTCGACGA 7TCGGG7GCA AAGATGCTGC 7GCAAA7™ GCACGCCGGA 2G77A2GCC7 24 c 

mgacoga27 TGCSGTCAGC GCGTGGCGGA 72AAGGCGC7 GA7CACCGCG tttggtgcgc jOC 
HAGCACTATC 0GC77GCGGG G72GAAGGGA 2CA72GCGGA 7TTCGCCC3C 70CGC2CAG7 360 

4 2 C 

480 

■GGCGG .57 4g ., 

:;:- v jrma7:o:j :'cr sec id :r; ; 

: SEQUENCE 2HARAC7ER 1 272 72 

A 1 i^ength : 1:1-3 im^z ic:as 

= :VP~: irr\;;— , .; ; ; 

; sr:-- a;jded::ess 

" r ' L.J G": . .r.'ji : 
>:*:-2; ; ; ; ; - :: - . . 

1 r ' : ; '■' 1 ' - ; - : Vi- 2^ "rz Ar- A _i 






50 

Ser Ala Ala Gin 
65 

Pro Gly Leu Mer 



Tyr Lea Glu lie 
100 

Gly Asp Ala lie 

115 

Trp Arg Asn lie 
13 0 

Zy 5 Asp Leu He 
14 5 

Pre Asp Arg Tyr 

Trp Asp Ala Leu 
13C 

Trp Gly Ser Ala 
195 




'."al Leu -ro Jlv 



55 

Asp Val He Cys 

Thr Ala Phe Gly 
as 

Ala Gly Lys Arg 

Val Phe Gin His 
120 

Mer. Pro His Leu 
:35 




60 

His Leu Cys Pro 
5 

Val Glu Pro Tyr 
90 

Met: Ala Tyr He 

105 

Gly Asn Pro Thr 

Glu Gly Leu Gly 
140 

Jer Asp Lys Leu 

15 5 

Gin Arg Asp Phe 

His val Val Leu 
135 

Trp Ala Asn Gin 

Ala 11- Val Thr 




His Arg Gin Glu 
80 

Gly Gin Pro Lys 
95 

Asp Glu Gly Lys 
110 

Ser Ser Tyr Leu 

125 

Arg Leu Val Ala 




Leu Phe Ala Leu 



Val Leu His Asp 

190 

His Ar^ Asp Arg 
205 

pro Mer Thr T^v 




aj: Me: 



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Pro Gly Val His ?he Val Sin Glu Asp Ser Asp Gly 7a; Val 3er 
355 36C * 3^5 



Trp 



Ala Gly Ala Arg Gln His Arg Arg Pro Gly Ser Ala Leu Ile Ser Arg
370 375 380

Asp Gln Glu Cys Asp Phe Arg Arg Arg Arg Arg Pro Ala Cys Gln Leu
385 390 395 400



Ile Arg Leu Pro Ala Pro Gly Arg Asp Ser Gln Gly Lys



Gly His Gin 



405 410 415 



Ser Gin Pro .eu Pro Ser Gin Arg Gly Arg 31:: lie Tyr Val Ala Glv 
420 425 430 

Arg " er 1Vr - eu ?r ~ Ser Glu Leu Val Ala Ala Phe Leu ~rp 



445 



la Gin Phe Glu Glu Ala Glu Arq He Thr Arg He Ar, Leu Aso leu 
450 455 460 



Trp Asn Arg Tyr His Glu Ser Phe Glu Ser Leu Glu Gln Arg Gly Leu

4^5 " 48C 



455 47Q 



-eu Arg Arg Pro ;> :i e Pro Glr. Gly Cys Ser His 



*yr Tvr Val 



Asn Ala H:s Met 
485 -190 43^ 

Leu Leu Ala Pr~ Ser Ala Asc Ara Glu Glu Val Leu Ala 



Leu Thr Ser Slu Gly 1> llv A. a / a . Phe Hiu Tyr Val Iro leu 



1 



Hio Aso Ser Pro Ala Glv Arg Ar~ 

i:;fcrmat:o!j fop. sec :l mc ■. 



:;ecle^;ce :ha^a step : sty ::y 
a „s^;gth ■ .:^4 ir.— . . 

S TY PE arr.i :;u _ _: 

: st^ai;lll>;ess . 

"Z P 1LSSV _ :nud: 

^ eeyje:; yz eesgp:pt:on je: i;s . ; - ^ 

ilu Ser All ? r - Ars Ser ~ — y~ ■ 



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ION 



G^y Asp Asp Arg .Ala 31 
50 

Gly Phe Leu Glu Pro Ala ?r 
Gly Glv Leu Thr Val As 



eu Gly Val Asp Glu Gin Phe Arg His Vd ' 
50 55 60 



Val Leu Val Asp Gin Arg Asd Asd Leu 
7 5 ' 30 



85 



p Trp Lys Val Ser Trp Pro Arg Gin Arg Gly 



90 



95 



Ala 



Thr Val Leu Ala Ala Val His Glu 



100 



105 



"rp Pro Pro He Val Val Hi 
110 



Phe Leu Val Ala Glu Leu Ger Gin 
115 



Asp Arg Pro Glv Gin His Pro Phe 
120 ins 



-^su . a^ . w cj 

1 3 0 



-eu .-ua Leu Arg Arg Jcr 
14 0 



- rils ^rg -^-3 — -a- Arc; Pre ^rcr 

15 0 - q - 

- 7 160 



j-y Aso Asd Arc; ? 



?ne His Glu Arg Asp Pr 
65 - 



Lieu mi s Ser Val 



3 Mp *- ' p\ 



• Ser Pro 



-a. Val Ala g: 



G_u A; g Arg Ala Pre Val Val 

Val 3lu Arg He Pre; Glu Arg 
205 



w a _ . e .n^ a Vai 



Ala 31-. H: 



: s v 






A.jArtL.rtT.n -oTVGGTGGT GGG TCGCAAG GCGTTTG 

GGGA7GGAGG CGATCGGGGG T7TGTGGGAT GGGTTGCGGG AAGAGCTGCG GGGTAGCGGA 120

ATGGGGGTGT GGGTGATGGA GGGGGG3CTG ACGGAGACAC CGCTGTTGGC GAACGTCGAC 180

GGGGCGGACA TGCCGCCGGC G7TTCGGAGC CTCACGGGGA TTCCCGTTCA G7GGGTCGGG 240

GGAGCGGTGG TTGACGGTGT GGCG 264

(2) INFORMATION FOR SEQ ID NO: 159:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1171 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 159:

;aj;gjgg3a cgatgacjt:; gcggtccagg ccgaccgctt gaagcaggag gg ggaggacg e: 

AAGCC3GT3C GATCCTTACC CGCGAAGCAG TGGGTGAGCA CCGGGCGTCC GGCGGCAAGC 12 0 

_ . - - - - - - — - j * - . -^_^^^j^^A . . _A — oACT GGCTGGATTC GCCGGACT7G 040 

' ' .-j^.-.-l. . :>j . -.-i'j^.^i _ .A JrtA I GGGG TTTCGTGCGG EG CTGAGTOG 30 J 



■v-J ^ . . IVj J . „ 






- -: J ■ — ..... j ; 



: ^ E I L EN J li " ! {Alt. A C T E ?. I 3 T 1 IS 



GGCACCACCG tcggttcgca cgtacggacc 3GGTCCGACA CCATGTTC3T GGCGCCAGTA 1080

ACCATCGGCG ACGGCGCGTA TACCGGGGCC GGCACAGTGG TGCGGGAGGA TGTCGCGCCG 1140

GGGGCGCTGG GAGTGTCGGC GGGTCCGCAA C 1171

(2) INFORMATION FOR SEQ ID NO: 160:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 227 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 160:

JCAAAGGCGG CACCGGCGGG 3CCG3CATGA ACAGCCTCGA GCCGCTGCTA GCCCCCCAAC 60

AGGGGGGCGA AGGCGGCACC 3GCG3CACC3 3CGGCAACGC ZZZCCCZZZZ 3CCACCAGCT 120

TIACCCAAGG CGCCGACGGC AACGOCGGCA ACGGCGGTGA CGGCGGGGTG GCCGGCAACG 180

3 "GGAAACGG CGGAAACGGC GCAGACAACA OGACCACCGG GGCCGCO 227

(2) INFORMATION FOR SEQ ID NO: 161:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 304 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 161:

ICCCCCCCAC 3GCTTCACTC CAACCAGGGG "GGCGACGGC 1GCGACGGCG '1 C A A C G G CG G ; I 

^. -Lin jlGoTCGGCG 3CAACG3CGO "GACGGCGGC \ATGGC0G0A ACGGCGGCAG "_3r. 



304 






GTGGGACGCT GCGCAGGCTG TATAACAAGG ACAACATCGA CGAGCGCCGG CTCGGTGAGC 60
TGATCGACCT ATTTAACAGT GCGCGGTTGA 3CGGGCAGGG CGAGCACCGG G2CCGGGA7G 120
TGATGGGTGA 3GTCTACSAA TACTTCCTCG GCAATTTCGC TCGCGCGGAA GGGAAGGGGG 180

GTGGCGAGTT CTTTACCCCG CCCAGCGTGG TGAAGGTGAT CGTGGAGGTG CTGGAGCCGT 240

CGAGTGGGCG GGTGTATGAC GCGTGGTGCG GTTCCGGAGG CATGTTTGTG CAGACCGAGA 300

AGTTCATCTA GGAACACGAC GGCGATCGGA AGGATGTCTC GATCTATGGG 2AGGAAAGCA 360

7TGAGGAGAC 2TGGCGGATG GCGAAGA7GA AGCTCGGGAC GGACGGCAT2 GACAACAAGG 42 0 

GGG~7GGCGC 7C3ATGGAGT GATACCTTCG GCCGCGACCA GCACGCGGAC GTGCAGAT3G 4 8 0 

^..j.^^l. , j. . . _ . — iu\Aj/K.o ju^^^GCAn'J GAGGAAGAC2 340 

2ACGCTGGGG ITTCGGTOTT CZSCZC'IZ^ ATAACGCCAA CTAGGCATGG ATT GAG GAGA -5 0 0 

:GTCGAACTC CAACGGGAAC GGGGA~A77C 3CGCGCAAAT GGTGGAG-3CG GA7TTGGTT7 720 

2 GT G CAT GGT GGGGTTAGGG ACGCAGGTGT 7GCGCAGCAG CGGAATC2C3 GTGTGGGTGT ^30 

3GT7TTTGG7 2AAAAA CAAG GCGGCAGGTA AGGAAGGGTG 7ATCAACC GG TG 2GGGCAGG 34 0 

; ^.^u.tcg" gaagtggggg ag7tag7gga 2ggggccgag ggggcgctga 90 0 

::g2GT2ggg "ggtgg^gg: ggtaatgggg :Ga;:tgg™ ~aa™cg:- jjgggtggtj : — 
\jggcgggaa 2gggggcaag 2gcggccagg "cggggacgg 2acgagggg:* 3gcgggggcg 






(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 163:

GGGCCGGCGG GGCCGGATTT TCTOGTGCCT TGATTGTCGC TGGGGATAAC GGCGGTGATG 60

GTGGTAACGG CGGGATGGGC GGGGCTGGCG GGGCTGGCGG CCCCGGCGGG GCCGGCGGCC 120

TGATCAGCCT GCTGGGCGGC CAAGGCGCCG GCGGGGCCGG CGGGACCGGC GGGGCCGGCG 180

GTGTTGGCGG TGACGGCGGG GCCGGCGGCC CCGGCAACCA GGCCTTCAAC GCAGGTGCCG 240

3CGGGGCCGG CGGCCTGATC AGCCTGCTGG GCGGCCAAGG CCCCGGCGGG GCCGGCGGGA 300

- — ju^-^GuuC GGGCGGTGTT GJCGGTGAC 329

- - -* ^ * v * ' ^ ^- 1 • - ) > _ C . m-j4 , 

(i) SEQUENCE CHARACTERISTICS:
,A) LENGTH: 3; oase pairs 
(B) TYPE: nucleic acid

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 164:

: CAACOGTGG CAACGGCGGC ACCAGCACGA OC^TGGGGAT GGCCGGAGGT AACTGTGGTG SC 



7H : 3 ?2 case ~a : : 
LOGY : 1 inear 



:;c . ; ■ 



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i: SEQUENCE CHARACTERISTICS ; 

(A) LENGTH: 535 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 166:

ACCGGCGCCA CCGGCGGCAC CGGGTTCGCC GGTGCCCCCC GCGGGGCCGG CGGGCAGGGC 60

3GTATCAGCG JTGCCGGCGG CACCAACGGC TCTGGTGGCG CTGGCGGCAC CGGCGGACAA 120

0GCGGC3CC3 GGGGCGCTGG CGGGGCC3GC GCCGATAACC CCACCGGCAT CGGCGGCGCC 180

3GCGGCACCG 3CGGCACCGG CGGAGCGGCC GGAGCCGGC3 GGGCCGGTGG CGCCATCGGT 240

ACCGGCGGCA CC3GC3GCGC 3GTGGGCAGC JTC3GTAAC3 CCGGGATCGG CGGTACCGGC 300

7GT.AC3GGTG 3TGTCGGTGG TGCTGGTGGT JCAGGTGCGG CTGCGGCCGC TGGCAGCAGC 360

3 CTAC EG3T3 GCGCCG3 3TT CGCCGGCGGC 3CCG3CG3A0 AAG G C G G A C C GG3CGGCAAC A2Z 

AGCGGTOTGG GC3GCA3CAA C3GCTCCGGC IGCZCGGGC^ GTG CAGGCGG CAAGGGCGG C 4 30 

ACCGGAGGTG CCGGCGGGTC CGGCGCGGAC AACCCCACCG GTGCTGGTTT CGCCG 535 

(2) INFORMATION FOR SEQ ID NO: 167:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 390 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 167:

^CJACCTCCC OGGGGCG AT A 0GOGGGTCA0 03ACTACTAC ATCATCCGCA CC3AGAATCG -5 0 

, * o .„ » . - ^ ,j . , . j _ ^ i j j,-, , . ;..T3C- 7CJAC0TGAT \2 






ATCAAC3CGA TCGGCTATCC CC7GGCGGCG ACGGTAGGTT TAGGCACGA7 GGATAGCGGG 

CGGCGTGGAA TTG CTCACCC TCCTCGCGGC GGCGTCGGAC ACCGTTCGAA ACATCGAGGG 

GCTCGTCACC TAACGGATTC GCGACGGCAT 

(2) INFORMATION FOR SEQ ID NO: 168:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 407 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 168:

AGGGTGACGG CGGTACTGGG GGCGGCCACG UCGGCAACGG G0GGAATCC7 GGG^GGC— 

7GGGCACAGC CGGGGG7GGG GG CAACGGTG 3CCCCCCCAG CACCGGTACT GGAGGTGGCG 

3GTCTGGGGG ZACCGGZGGC 3 ACGG CGGGA CCGGCGGCCG TGGCGGCGTG TTAATGGGCG 

ZZGGCGCGGG CGGGCACGGT G G CA CTGGC G GCGCGGGCGG 7GCC3GTCTC GACGGTGGGG 

3G3CGGGCGG GGGGGGCGGG 3CCGGCGGCA AC3GCGGCGC CGGGGGTCAA OCCGCCCTGC 

TGTTCGGGCG GGGCGGCACG GGGGGAGCGG GGGGCTACGG GGGCGATGGC GG7GGCGGCG 

GTGACGGCTT CGACGGCAGG ATGGCCGGC7 TGGGTGGTAG CGGTGGC 

(2) INFORMATION FOR SEQ ID NO: 169:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 463 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 169:



:GGCGGGC "CGAGGCGA ACACGTCGG7 GTCACCGGTG rAGATGGGGG 

AGCACGGGCC GC 3 C3GCGAA 3G7GOCCCC3 JTCGCCAGTA 
J " • v '~ ' * ~ ~ J A - ■ : ■ j . ; j . C.-\ G 3 '. - - jG 731 GAG 



60C 
690 



13 0 
240 
3 00 
360 
4C~ 
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2 information for seq id r:o:i^o- 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 219 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 170: 

GGTGGTAACG GCGGCCAGGG TGGCATCGGC GGCGCCGGCG AGAGAGGCGC CGACGGCGCC 60 

GGCCCCAATG CTAACGGCGG AAACGGCGAG AACGGCGGTA GCGGTGGTAA CGGTGGCGAC 120

GGCGGCGCCG GCGGCAATGG CGGCGC^CGC GG CAACCCGC AGGCGGCCGG GTACACCGAC 180 

jGCGCCACGG 3C\CZ^CZG cgaoggoggg AACGGCGGC 219 

(2) INFORMATION FOR SEQ ID NO: 171:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 494 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



5 0 

; a o 

2 4 C 



iNFipy.ATi fsf sec 



4 ^4 



C IHARAGTET. V 
NGTH 2 2 " -a:-.' 4 .1 : :\ 
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iX1 ; SEQUENCE DESCRIPTION: SEQ ID NO: 1^2: 
GGGCCGCTGG TGCCGCGGGC CAGCTCTTCA GCGCCCGACX^ CGCGGCGGGT GCC3TTGGGG 
TTOGCGGCAC GGGCGGCCAG GGTGGGGCTG GCCGTGCCGG AGCGGCCGGC GCCGACGCCC 
CCGCCAGCAC AGGTCTAACC GGTGGTACCG GGTTCGCTGG CGGGGCCGGC GGCGTCCGCG 
GCCAGAGCGG C^ACGCCATT GCCGGCGGCA TCAACGGCTC 

(2) INFORMATION FOR SEQ ID NO: 173:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 338 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



(2) INFORMATION FOR SEQ ID NO: 174:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 400 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 174:



b0 

180 
220 



63 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 173:
ATGGCGGCAA CGGGGGCCCG GGCGGTGCTG GCGGGGCC3G C G ACT ACAAT
GGGCAGG3TG GTGCCG3CGG CCAAGGCGGC CAAGGCGGCC TGGGCGGGGC AAGCACCACC 120

-GATC3GCCT AGCCGCACC2 GGGAAAGCCG ATCCAACAGG CGACGATGCC GCCTTCCTTG 

2CGCGTTGGA ccaggccggs ATCACCTACG ■CTGACCCAGG CCACGCCATA ACGGCCGCCA 
AGGCGATGTG TGGGCTGTGT GCTAACGGCO TAACAGGTCT ACAGCTGGTC GCGGACCTGC 

3GGACTACAA 7GGGGGGGTG ACCATGGACA GCGCCCCCAA GTTGGCTGCG AT G GOAT GAG 



180 
240 

3 3C 
36C 
3BH 



. .-\ Au w v 



~3G03CC3GC 



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ID) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 177:

AGCAGCGCTA OCGGTGGCGC OGGGTTCGCC GGCGGCGCCG GCGGAGAAGG CGGAGCGGGC 60

GGCAACAGCG GTGTGGGCGG CACCAACGGC TCCGGCGGCG CCGGCGGTGC AGGCGGCAAG 120

GGCGGCACCG GAGGTGCCGG CGGGTCCGGC GCGGACAACC CCACCGGTGC TGGTTTCGCC 180 

GGZGGCGGCG GCGCCACAGG TGGCGCGGCC GGCGCCGGCG GGGCCGGCGG GGCGACCGG" 24 0 

AGGGGCGGCA CCGGCGG COT TGTCGGC3CG ACCGGTAGTG GAGGCATCGG GGGGGCCGGC 30C 

ZGCGGCGGCG 3TGACGGCGG GGA7CGCGC0 AGGGGT0TC3 3CCTGGGCCT GTCGGGCTTT 36C 

O.^GGGGGGCC AAGGCGGCCA ^u^uUu^ JGCGGCAGCG GGGGCGCC3G CGGCATCAAC 42: 

JGGGCCGGCG 3GGC3GGGGG GAACGGCGGC '^CGGGGGGG AG3GCGCAAC GGGTGCOGCA 4SC 

G3T373G3GG AGAACGGCGG GGTCGGCG jT GACGGTGGGG C3GG7GGC3C CGCCGGCAAC 54C 

33 3GGCAACG CGGGCGTCGG CCTGACAGCC AA3GCC3GCG A 2GGCGGC3C CGCGGGCAAT 6 CO 

GGCGGCAACG GGGGCGCGGG 3GGT33TGGC GGGGCCGGCG ACAACAATTT CAACGGCGGC 66C 

T^GG^^TAGC GGCACC3GGG AAA3G3GATG GAAGAGGCGA 3GATGCCG3G TTCCTTGCC3 78C 
;G«G3A 3G~~73~A~3 AGOTAGGOTG A'J GGAGGG'JA 3 3CGATAACG 3GCGGGAAGG 3 4''. 

::::" ipy.AT'i:: ~o? ie: :d *;r ; -~ 

A LENGTH - : 13- jas- : , ; - 
3 TYPE - "u;.e. ... ic.j 

: riPANCETNES.: , . .... - 

3 TGPOLGGY li.ied: 
JE^YENCL 0E33P;?~:0N SEC IE 






400 



GCGGCGACGG TGCACTCTCA GGCAGCACCG GTGGTGCGGG 

(2) INFORMATION FOR SEQ ID NO: 175:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 538 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 175:

GGCAACGGCG GCAACGGCGG CATCSCCGGC ATTCGGCGGC AACGGCGTTC CGGGACGGGC 60

AGCGGCAACG 3CGGCCAACG GCGGCAGCGG CGGCAACGGC GGCAACGGCG GCATGGGCGG 120

CAACAGCG3C ACCGGCAGCG 3C0AC0GCGG 7GCC0GCGCG AACGGCGGCG CGGCGGGCAC 180

3GGCGGCAC:: GGCGGCGAC3 "~GCCTCAC 3GGTACTGGC GGCACCGGCG GCAGCGGTGG 240

CACCGCCGGT GACGGCGGTA ACGGCGGCAA CGGAGGAGAT AACACCGCAA ACATGACTGC 300

3CAGGCGGGC GGTGACGGTG GCAACGGCGG CGACGGTGGC TTCGGCGGCG GGGCCGGGGC 360

-GCCGCCGGT GGCTTGACGG CTGGCGCCAA 7.SGCACCGGC GGGCAAGGGS GCGCCGGCGG 420

GGATGGCGGC AACGGGGCCA TCGGCGGCCA CGGCCCACTC ACTGACGAGC CCGGCGGCAA 480

^ZZGZAACZ GCGGCAGCGG CGGGACGGGC 3GCGCGGGOA TCGGCAGG 535 

~::r itrmaticn ?cr sec :z :;c.:"£ : 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 9 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear







AGA GAA GACGGCCAAC GAGG7GGAGG GGCCGATGGC. GGACGCACGG AGTGATG7CC 300

CCATCACACC GTGCGAACTC ACGGCGGCTA AAAACGCGGC CCAACAGCTG GTAT7G7CCG 360

CCGACAACA7 GCGGGAATAC GTCGCCGCCG GTGCGAAAGA GCGGCAGGGT GTGGGGACG7 420

CGCTGCGCAA CGCGGCCAAG GGGTATGGCG AGGTTGATGA GGAGGCTGCG ACCGCGCTGG 480

ACAACGACGG CGAAGGAACT GTGCAGGCAG AATCGGCCGG GGCCGTCGGA GGGG ACAGTT 540

CGGCCGAACT AACCGATACG CGGAGGGTGG ZZACGGCGGG TGAACCGAAC TTCATGGATC 600

TCAAAGAAGC GGCAAGGAAG CTCGAAACGG GCGACCAACG CGCATCGCTC GCGCACTTTG 660







120 






-GCGGCGCGC CAGCAGCTGA CCGCTGCG™ ATC3GCCATG TCCCGCCCSA TGAA CG A AGG 



1920 
198C 

GGTCATTCAG CGCGCCCGAC ACGGCGTGAG TACGCATTGT CAATGTT7: 



AA tg g c ct aa GCCCATTGTT GCGGTGGTAG CGACTACGCA CCGAATCAGC GCCGCAATGC 



TG ACATGGA7CG 2 04 0 

GCCGGGTTCG GAGGGCGCCA TAGTCCTGGT CGCCAATATT GCCGCAGCTA GCTGGTCTTA 2100 
GGTTCGGTTA CGCTGGTTAA TTATGACGTC CGTTACCA 

1 3 8 

(2) INFORMATION FOR SEQ ID NO: 179:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 460 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 179:
Met Thr G;:; Ser Gin Thr Va I Thr Val 



Asp Gin o:r. Glu lie Leu As:: 

IS 



Arg Ala Asr. Glu Val GI 



u Ala Pre Met Ala As:; ?:n Pre Th: 
25 * 30* 



20 _ . -r-r A^p Val 



?rC 3 " — ^ Ala -a Lys A.:: Ala Ala G- Gm 

311 40 45 

I* 1 ~ eu 3er Ala As ? Asn ^ Arg Glu Tvr Leu Ala Ala G : v Ala 



- V .; \r -: 

SI 



- -r -r*u ^.a Ihr o'er leu Arq As:: Ala 



■ ■■ ^3 ,,^a .-^ MSD rt5r . Asr) 

33 3C 



: r, 



■\.;~ Pue ''let A5t * e" " *r<~ ^ - . » i 



Thr r.y Asp 
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121 
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Gin Trp He Leu His Met Ala Lys Leu Ser Ala Ala Me: Ala Lvs Gin 
195 20C 205 



Ala Gln Tyr Val Ala Gln Leu His Val
210 215



Trp Ala Arg Arg Glu His Pro 
220 



Thr Tyr Glu Asp Ile Val Gly Leu Glu Arg Leu Tyr Ala Glu Asn Pro
225 230 235 240



Ser Ala Arg Asp Gin He Leu Pro V 



/al Tyr Ala Glu Tvr Gin Gin Arg 
245 * 255 



Ser Glu Lys Val Leu Thr Glu Tyr Asn Asn Lys Ala Ala Leu Glu Pro
260 265 270



/al Asn ; j rc ?r: Lvs Pro Pro ^o \1 



a _e Lys 11^ A so ^ro Pro 3 rc 

- q ^ 



?r3 Prn Gln Glu = la Gly Leu ;ic Pro 31v ?he Leu Me- o~, S e- 
290 295 ' , nn 



Asp G:y Ser Glv Val Thr Pro 



-y Tor Gly Met Pro Ala Ala Pro Met 

310 3 15 320 



D - - 3 r- ^ 



Ser Pro Gl- 



.eu Pro Ala Asp Thr Ale 



230 



,a Gin Leu Thr Ser Ala 
34 0 



j-u ,^a .-uj , VJ .a Leu Ser Gly Asp 

^= 350 



a^ 7a _ 




WO 99/421 18 
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;A} LENGTH: 2 7*" amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 180:

Ala Gly Asn Val Thr Ser Ala Ser Gly Pro His Arg Phe Gly Ala Pro
1 5 10 15

Asp Arg Gly Ser Gln Arg Arg Arg Arg His Pro Ala Ala Ser Thr Ala
20 25 30

Thr Glu Arg Cys Arg Phe Asp Arg His Val Ala Arg Gln Arg Cys Gly
35 40 45

-u-j -ro oer Arg Arg j^n .eu .Arg Arq Arg 7a 1 Ser Arc: Glu Ala 

50 =5 6G 

Thr Thr Arg Arg Ser Gly Arg Arc: Asn His Arc: Gvs Glv ~- s 

Gly Thr Gly Ser His Thr Gly Ala Val Arg Arg Arg His Gln Glu Ala
85 90 95

Arg Asp Gln Ser Leu Leu Leu Arg Arg Arg Gly Arg Val Asp Leu Asp
100 105 110

31 ''' 31 V Ar 3 -eu Arg Arg Val Tyr Arg ?ne Gin Glv Cys Leu 7a. 



j - n .iij ".eu ^eu Arc; Pr" 



ne va- ^ys 



]1, = „ Asp Tyr .... . . ... ^ jer 



V I <3 - '_) h p 'J »-~ ~J *- ~ 
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Cys Arg Phe Phe Slu He His 31u Val Gly Phe Thr Civ Arg Sly His 
260 265 270 

Pro Arg Arg Ile Gly
275 

(2) INFORMATION FOR SEQ ID NO: 181:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 181:

^ J 5er 5 he He rt sp :rp ^eu Asp Ser Pr- Asp Ser ?rr; 
3 ;o :5 

Leu Agd Pro Ser Leu Val Ser Ser Leu Leu Asn Ala Val Ser Cys Gly
20 25 30

Ala Glu Ser Ser Ala Ser Ser Ser Ala Arg Ser Gly Asn Gly Ser Arg
35 40 45

Tr? Thr Ser :ie: Pro Ser Sly Thr Arc; Pro Glv Pro Ar~ Ars Ala Th- 

50 55 60 



WO 9*) 421 IS 
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IM 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 196 amino acids

(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 182:

Gln Glu Arg Pro Gln Met Cys Gln Arg Val Ser Glu Ile Glu Pro Arg
1 5 10 15

Thr Gln Phe Phe Asn Arg Cys Ala Leu Pro His Tyr Trp His Phe Pro
20 25 30

A^a Val Ala Val Phe Ser Lys His Ala Ser Leu Asp Glu Leu Ala Pro 

40 45 

Arg Asa Pro Arc: Arg Ser Ser Ara .Arg Asp Ala Slu Asp Aru Aru Val 

55 -30 

*y ? " e Ala Ala — r Val Ala Val Asp Pro Pro Leu Arg Sly Ala 

^ 80 

Gly Sly Glu Ala Asp Sir Leu lie Asp Leu Glv Val Cys Arg Arg Gin 

Ala Sly Arg Val Arg Arg sly Sin Siu Leu Hm His Arg His Arg His 
: °° 105 no 



WO 9<J/4211N 
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XI SEQUENCE DESCRIPTION: SHQ ID NO: 13 3. 

Val Arg Cys Gly Thr Leu Val ?:d Val Pro Met Val Glu Phe Leu Thr
5 10 15

Ser Thr Asn Ala Pro Ser Leu Pro Ser Ala Tyr Ala Glu Val Asp Lys
20 25 30

Leu Ile Gly Leu Pro Ala Gly Thr Ala Lys Arg Trp Ile Asn Gly Tyr
35 40 45

Glu Arg Gly Gly Lys Asp His Pro Pro Ile Leu Arg Val Thr Pro Gl-

50 55 60

Ala Thr Pro Trp Val Thr Trp Gly Glu Phe Val Glu Thr Arg Met Leu

65 70 75



3C 



La Slu Tyr Arg Asp Arg Arg ly- Val Pre He Val / 



35 



Arg Sl~ Arg Ala 

9 0 95 ~ 



Ala He GIu GIu Leu Arg Ala A: g ?h» Asn Leu Ar- IV r Pro .eu Ala 

100 ii- 

His Leu Arg Pro Phe Leu Ser Thr His Glu Arg Asp Leu Thr Met Gly
115 120 125

Glv GIu Slu He Gly Leu Pro Asp Ala 3lu VdL ~;;r H Aro Thr Glv 
130 H5 

Hr. A_ a ^eu ^eu Gly Asp Ala Ar- Tro Leu Ala Ser L-u Va : Asr 

^ -50 :=5 ... 



..sp Leu Arg Ser Ser Avrg Jla Va. A..* Ar~ Aru 3lv Pro H 
19 C 



.Arg .a. 



13 5 ,90 
^ • :le 1 ' : • ' - P- Leu as: 



jL'. v S^r. Ser 



a r. .-a s p A.a A 1 ^ 



t Ve" . Asp I.u Tvr Arq Sir. Pnr 



WO <J9.'42I1S 
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1 J 6 



:90 



Gly ?he Vai Val ,Ua Leu Vai Leu Glu Ale 



La Val Glv ^eu Asp 
295 30C 



Arg Asp Val Tie Vai Ala Asd 



305 



3 10 



(2) INFORMATION FOR SEQ ID NO: 184:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2072 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 184:

CGCCACGA GCTOAGCAGC "CAAGGGGCZ 



. _ v. ^ ^. j; 



rC "° C::: GGCAAGGGTA A G CAAATCAA GACCACGCTO AACAGCCTGT 



'-u,:_^C3CG G . ACATCAGG ACGACCAACA GTTCSTCGC 
3A GTTCACCGAC AGGTTGACCC ACTGCGATGG GGACCTGTi 



VJ _ ^_ ^ 



'^j A/ \ i_ 7\AG 



La: 



30C 






^T^GwCGGGC GTGCCGAGCT TCCTGTTCGG GGTGTCATGT AGCCGCGGGCT 1200 

GTGGAACGAT GGGGGATCGG CACGTGTTGA TACCGGCGAT CACCGGCCTG GCGTTGATGG 1260

GGGGATTCGT GGGACATTCG TGGTAGGGCA CAGAACATGC GCTCATAGAG ATGCGGTTGT 1320

TCCAGAACCG AGGGGTCGCG GAGGGGAACA TGACGATGAG GGTGCTCTCC CTCGGGCTGT 1380

TTGGCTCCTT CTTGCTGCTC CCGAGCTACG TCCAGCAAGT GTTGCACCAA TCACGGATGG 1440 

AATCGGGGGT GCATATCATC CCACAGGGCG TCGGTGCGAT GCTGGCGATG CCGATGGCCG 1500

GAGCGATGAT GGACGGACGG GGACGGGCCA AGATCGTGCT GGTTGGGATG ATGCTGATGG 1560

GTGCGGGGTT GGGCACCT*TC GCGTTTGGTG TGGCGCGGGA AGCGGACTAC TTACCCATTG 1520 

ZGCCZ^ZZ^G GGTGGCAATC ATGGGCATGG GCATGGGCTG GTGCATGATG GCACTGTGGG IGan 

TCAAG GAGGA GGTGGGCGGT TGGATAGGGA CGGCACTGAT GTGGGTGCTG GTCAGCTAGG 130C 

AGTTCAATGA GAGGGAAATG ATGGGTAGTG CAAAGAAAGT GGCACTGAGG GCAGAGAGTG I860 

GGG GGGGGGG GGGGGGGGCG GTTGACGCTT GGTCGCTAGG GCGCGAAACG AACTTGGCGG 19 20 

GGGAACTGGT GCATGACGTT CCGGAGGGGT ACGCGGTGGT ATTCGTGATA GCGAGGGGGG 1980 

TAGTGGTGTG GAGGGTGATG GGGGGGGCAT TCCTGCCGAA ACAGCAGGGT AGTCATGGAA 2 04 0 



(2) INFORMATION FOR SEQ ID NO: 185:
(i) SEQUENCE CHARACTERISTICS:

A LZIJGTH : cace :a 

c gtra^cegkesg : circle 
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GAGTTGGGGG GGGCCGAATT GGGGCATTGG GTGGAAGCCG AGGGGATGCG 4 30 

7GGGGTGGG7 GGTGTTTTGG GGGGCCGGAT GGGGACGACG AGAAGGACGA '7GGCGGCGAT 54 0 

GAACAGCGCG AGGGCAATCA GGACCAGCAG ATTTCGCACG CATACCCTGT CGTACCGCTG 60 G 

CGCCGCGGTT GGTCGATCGG TCGCATATCG ATGGCGCCGT TTAACGTAAC AGCTTTCGCG 66C 

GGACCGGGGG TGACAACGGG CGAGTTGTCG GGCGGGGAAC CCGGCAGGTC TGGGGGGCGG 72 0 

TGACCGGAGC TCACTGG TG C ACGATCCGGG TGTGGGTGAG CGTGGAACTC AAACACACTC 7 30 

AAGGGCAAGG GTTTGTGAGG T7ACGAGCTG AACGTGGACC CGCAATCGCT GGTACGTTTG 34 C 

GAG 7GGGGGG AGGTGGGGAG TGAGCAG GTT TGCGGG GGCA GGTTTGGGGG TGAAGGGGAC 900 

-AGGGGATGG 7AGGTTGGGG C^GGGGTGAC ATGGTG7TGG G GGAGGTG DT CGGTGAAGCG ?6 7 

GG GA TAT GAG 7AGGCATGGA GTGCGAGGTA GTTGGTGGAG GTGATGTGGG GCAAGTAGGG :o:c 

GT3GAGGGCA AGA3GGGGAA TACGATGGGG GGGTG GTAGG G GGGTGAA 3 A GGGAATAGGT IjSZ 

TT G GAG AG CG GGGTG 7GGGA T 7AGATGGAG GGGAGGGTTG AGGGGGGGGA 114C 

gtggggttgg 7gggaggtgg g 3jaatggggg aaggaggagg gtggtgtgtg gtgggatgag 12 3 c 
:g:ggtgtgg gatggagggt ttggggaa:g atttogtggg tgaagggggg gaggggacgt 

tgtgggggtg ggaggagaag ggagcgtt7g ggaacgagtt z 7agacgggt gggggggggg \22z 

atggggtagg jtatggaagg attgtttggg aaatggctga 7 ggagggtg7 7g7gggggtg ibo- 

;t;:;tgggta t"gaaggggt ;Ttgt~tgag ;t ;aaj jtga : .;a gggtgat ^gggaggggg 

^ ' 7 ' : - 1 T ■ • TGGGGOGG ~ ~ 7 AAA 7GT .1 : "7 "~G A. 7" G" 7~ATT r "7A7 5 fj." 
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i SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1055 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 186:

CTGCCCT3CC AGTGTCACCG GCGATATGAC GTCGGCATTC AATTTCGCGG CCCCGCCGGA 

CCCGTCGCCA CCCAATCTGG ACCACCCGGT CC3TCAATTG CCGAAGGTCG CCAAGTGCGT 

3CCCAATGTG GTGCTGGGTT TCTTGAACGA AGGCCTGCCG TATCGGGTGC GCTACCCCCA 

AACAAC3CCA GTCCACGAAT CC3CTCCCGC GCGGCGGATT CCCAGC3GCA TGTGCTAGCC 

;.:.™.TG^TT CACACGTAAC 2GTTG3CTAG GTCGArtACGC JCGCCAGGGC 3GCTGG AC GG 

JL'CCATGGCA GCGAAATTAG AAAACCC3CG ATATTGTCCG 33GATTGTCA 

A.1T3 3TT33T GGTTCGTGTT TAG CCATTGA GTGTGGATGT GTTGAGACCC TGGCCTGGAA 

GGGGACAACG TGCTTTTGCC TGTTGGTCCG CC7TTGCCGC CCGACGCGGT GGTGGCGAAA 43C 

13333TGAGT CGGGAATGCT 2GGCGGGTT3 TCGGTTCGGC TCAGCTGGGG AGTGGCTGTG 

2AGGCGGCC3 AAGGGGCGGA CGCAGAGGC'J 'JCGGCCATGG ACGAGTCCGA TGAGTGGCAG 
:-CCT3GAAC3 AGTGGGTGGC 3GAGAA33CT JAACC3CGCT TTGAGGTGCC ACG3AGTAGC 

:TT3A3 2AGT 3ATCGGCGGT "TGGGTGT* - " 7 :33JGGCC3G GTATGACAAC AGT3AAT3T3 h 4 ; 

3ATGACAAGT TACAGGTATT AG GT ? C AGGT 73AACAAGJA JACAGGGAAC ATGGCAACAC ?00 



b0 

24U 

30 J 
^ a 
4 2 :; 



5 4 C 

5 c ; 
56; 



gz;^::;c" jhapactef. : 3t: 22 

LENGTH- : 2 ~ oase pa:: 
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120 
180 
240 
300 
359 



TCCGGGCTGA CCACCGGGAT CGCCGAACCA TCCGAGATCA CCTCGCAATG ATCCACCTCG 

CGCAGCTGGT CACCCAGCCA CCGGGCGGZG TGCGACAGCG CCTGCA7CAC CTTGGTATAG 

ECGTCGCGCC CCAGCCGCAG GAAGTTGTAG TAGTGGCGCA CCACCTGGTT ACCGGGACGG 

GAGAAGTTCA GGGTGAAGGT CGGCATGTCG CCGCCGA.GGT AGTTGACCCG GAAAACCAGA 

TCCTCCGGCA GGTGCTCGGG CCCGCGCGAC ACGACAAACC CGACGCCGGG ATAGGTCAG 

(2) INFORMATION FOR SEQ ID NO: 188:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 350 base pairs
(B) TYPE: nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 188:

. ..... ArtGG GCCC'ICCCTJ GTCGCATGAA GTGGTGGAAG 60 

ATGCATCTT GGGAGATTCC CG C TAG A G C A AAACAGCCCC TAGTCCTAGT CCGAGTCGCC ' ^ i 

GGAAAGTTC CTCGAATAAC TCCGTACCriC G AG C G C CAAA CCCGGTCTCC TTCGCTAAGC 130 

GCGCGAACC AC7TGAGGTT ZCGGGkG~Z?. 7TGACGTCCA GACCGATTCC TTCGAGTGGC 240 

~~ ------ -,o^ -^GArtTCEG .7CGCCGAGGG GGGTGATGTC AACCCAGTGG 300 

. jo-^. jGA AGAGGTGCTC TACGAGCT3T ^ " ~ ATCC A ^^^^r^^,^^^.-, ^ 

(i) SEQUENCE CHARACTERISTICS:

A ^.ENGTH : immc dCld5 

(B) TYPE: amino acid
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PCTTSW03265 



Gln Gln Phe Val Ala Leu Asn Lys As:: Leu Ala Glu Phe Thr Asp Arg
65 70 75 80

Leu Thr His Ser Asp Ala Asp Leu Ser Asn Ala Ile Gln Gln Phe Asp
85 90 95

Ser Leu Leu Ala Val Ala Arg Pro Phe Phe Ala Lys Asn Arg Glu Val

100 105 110

Leu Thr His Asp Val Asn Asn Leu Ala Thr Val Thr Thr Thr Leu Leu

115 120 125

Gln Pro Asp Pro Leu Asp Gly Leu Glu Thr Val Leu His Ile Phe Pro

130 135 140

Thr Leu Ala Ala Asn Ile Asn Gln Leu Tyr His Pro Thr His Gly Gly

145 150 155 160

----- ^.O U 



a_ .a. .^er 



;r Ala Phe Thr Asn Phe Ala Asn Pro Me- 



19C 135 13C 

.^a G-j ^eu Gys Ala Gin Tyr Leu Ala Pro Val Leu Asp Ala lie Lys 

1?5 =00 2 OS 

Pne Asn Tyr Pne Pre Phe Gly Leu Asp. Val Ala Ser Thr Al* Ser Th*- 
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355 



360 



365 



Asp Tyr Met Gly Leu Leu Leu Leu Ser Pro Gly Leu Ala Thr Phe Leu
370 375 380



Phe Gly Val Ser Ser Ser Pro Ala Arg Gly Thr Met Ala Asp Arg Hi 
385 390 395 



400 



V, 



al Leu lie Pro Ala lie Thr Gly Leu Ala Leu lie Ala Ala Phe Va: 

405 .no 415 



Ala His Ser Trp Tyr Arg Thr Glu His Pro Leu Ile Asp Met Arg Leu
420 425 430 



Phe Gin Asn Arg Ala Val Ala Gin Aid Asn Met T"hr Met Thr 7a I 



13 5 



440 



Ser Leu Gly Lju Phe Gly Ser Phe Leu Leu Leu Pru Se: 
450 455 -i6 0 



Gin Val Leu His Gin Ser Pre Met Glr. S 



er G^v 7a 1 Hi; 



Pro 
4 3 0 



Cjiy Leu G.y Ala Met Leu Ala Met Pre He Ala Glv Ala Met Me 
485 49c 495 



Asc Arg Arg Glv Pro 

5 a c. 



A. a ^yz He 7a, Leu 7a 1 Gly lie Met Leu He 



i ~ a .-^ a 



:hr Phe Ala Pr.e G.v 7a. 



.-L*a ^r- _} + n .~^a ,-asd 



-o Leu oer sly A. a A. a '.'a. 

i; B 5 
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Thr Leu lie Pro Ala Ala ?he Leu Pro Lys Gin Gin Ala Ser His Arg 
560 565 670 

Arg Ala Pro Leu Leu Ser Ala 

(2) INFORMATION FOR SEQ ID NO: 190: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 120 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 190:



Me- Va; Gl\ 



Ala Val Gln Thr Glu Asp Lys Tyr Gly Val Lys Ile
25 30



ro Asp 3iu Asn Leu Ala 31' 



-«eu Arg ^hr jb.^ Gi y Asp Va^ Va^ Ala 
■iV 45" 



^ Pro Glu Ala Ala 31n A* a Leu 

60 



A3n Pro Asp Ala Ala Arg Ma Asd Arc: 
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20 



30 



Ile Ala Glu Gly Arg Gln Val Arg Ala Gln Cys Gly Ala Gly Phe Leu
35 40 45

Glu Arg Arg Pro Ala Val Ser Gly Ala Leu Pro Pro Asn Asn Ala Ser
50 55 60

Pro Gly Ile Arg Ser Arg Ala Ala Asp Ser Gln Arg His Leu Leu Ala
65 70 75 80

Gly Asp Gly Ser Asp Val Thr Val Gly
85

(2) INFORMATION FOR SEQ ID NO: 192:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 113 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 192:

Ala Ser Leu Leu Ala Tyr Ser Ala Ala Ala Ala Ser Thr Ala Leu Ala



/ a _ Ala Cy s Vd . Arg Ala As r> Hi: 



Arg Asp Arg Arg Thr He Arq Asp 

-5 30 
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.xi; SEQUENCE DESCRIPTION: SEC ID NO: 193: 

Arg Ala Arg Gly His Arg Ser Ser Lys Gly Ser Arg Trp Ser His Glu
5 10 15

Val Leu Glu Gly Cys Ile Leu Ala Asp Ser Arg Gln Ser Lys Thr Ala
20 25 30

Ala Ser Pro Ser Pro Ser Arg Pro Gln Ser Ser Ser Asn Asn Ser Val
35 40 45

Pro Gly Ala Pro Asn Arg Val Ser Phe Ala Lys Leu Arg Glu Pro Leu

50 55 60

Glu Val Pro Gly Leu Leu Asp Val Gln Thr Asp Ser Phe Glu Trp Leu



■ _JL -'j - -j-u oer rt ^a .-»._a l.u Arg Gl v Asn 



~'y G!y Leu G/.i Glu Ya \ Leu Tvr Glu U*.u Ser Pr 



^so Phe Ser 



SEC ID NC : 1 94 



(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: -311 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



_AG A ^ -i^GGATGGGG GGGTGGCTGG TGGCGAxGC 



1- ^rU'JV - 



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] Jf> 



"CzGTGCVGGC 3TCGTC3ATG GTGGTTTACG GGCAGGGGCO CTATGACTGT GCCCAGCATG 

GACCGGTC3A CGGCGGCGAG GCGACCTGGA CAATGGGGTC TTCGAGCACC 

GTTGCCCGGG GTGCGGCGAG CCAGTCATCT GGCAATTGGT CGACGAAGAT GCCCCGT7GC 

GCCCGCGCAG CCTG7ACGCG GCAGCAAGAC CGCGCAGGAG CACTACGCGC TGGCGTGGTC 

GGAAACGAAT GGCGGTTCCG TGGTGGCG'fT G 

(2) INFORMATION FOR SEQ ID NO: 195:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 966 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 195:



(3 



A^GACCTTC 



ifiL- irtuT- JAAGATAt 



- GCCGGTGAm 30TG.ACAGC 



a - . j^^wa^rtu .-i^ ^.-vv^ o ACAT CAGG1 



^ — - - -j j^^.ACGAGT 2CGGC3ACG3 3CAAAT3GTG jCGATCACCC 

^ ^-^a . _ jv- w.wj. _ GTGAAGAAC .j CAG CCGGGA 3 AT 33 AGGTG ""TGGAGTTCC 



o'OC 
660 

780 
311 



3ACTTTGTGG TOCCGGTGGG GGGATAGAGC ACCTGTCGGC GTTGGTCAG 3 GTCACC 
3CTCGGAC3C C3AAC 3 CAT3 C7TTCAAC3T AGCCTGTGGG TCACACAAGT 3GCGAGCGTA 13 C 

AGGTCACGGT CAAATATCG3 3TGGAATT7- GCGG7GAGG7 TCCGCT33C3 3 ACAATCAAG 24 C 






(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2367 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 195:



CCGCACCGCC GCCAATACCG CCAGCGCCAC CGTTACCGCC GTTTGCGCCG TTGCCCCCGT 
TGCCGCCCGT CCCCCCGGCC CCGCCGATGG AGTTCTCATC GCCAAAAGTA CTGGCGTTGC 
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GGGGTAGC AC7GGC7GGG ::CwGCC3: 3G7TGCGGA7 GAAGGGGGC7 GCGCGTCGGC 1440 
2GGGGGG7 TTGACCGAAC CCGCCACCCG CGCCGTTGCC AGGG77GCGA AACAGGAA73 



2GGGGGCG3G GGGAGGCTGG GCGGGTGCCG TCGGGTCGGG GGGGT7TCGG A7CAACSG 



GCCGCAAAAG CGCCTCGGTG GGCGCATTCA CCGCACCCAG CAGACTCCGC TCAACAGCGG 
CTTCAGTGCT GGCATACCGA CCCGCGGCCG CAGTCAACGC CTGCACAAAC TGCTCGTGAA 
ACGCTGCCAC CTGTACGCTG AGCGCGTGAT ACTGCCGAGC ATGGGCCCCG AACAACCCCG 
GAA7CGGGGG GGACAG7TCA 7GGGCAGCGG GACGGACGAG TTCCGTCGTC GGGA7GGGGG 
GGGGAT7 AGGGGGGGTG AC7TGGGAAC 3AA7AGT G G A 7 AAA7 C GAAA 3GGGCAG7TG 

GA2C 273 7GGGGAGG3G 2GAGG7GAC" 7773GTG3G7 3GG7GGG3GC 2G7GAG7A72 
^TGG77G7 GA77GGGGG7 Z,~Z?,CZCAGZ TTGTTGCGCG AGTTGAAGAC 



. ^_ . ^ 



- - O /-V ^ ^uT\ ^_ vj ^ 



. _ ~ - j . .-i j A 1 G'JGGAAG 2GGAGGG 



^Ci:e:jch: :hap.ag73:p.:s7:;;g . 

A. LH^JGTrl : 1^6 ir,:r.c 
"PA? rL K L' ? .' L S 



1500 
1 5 6 C 
1620 

isao 

1740 

13CC 
1360 
19 2 3 
138C 



GGGAGGAGAG GCCGAGCTTG G7G7AGACGT GGG7CAAG73 GGAATGCACG GTGCGCGGCG 2 ICC 

.^r^^^w^o ^oual^l.j 7GC7GAGTGG G7GACGGAG2 AG7AGAGCGA 21oG 

.-l^^^.C . 373GG7G72 AAC3GGCGGG ; n--- ..-.^^^ ____ ___ ~ „ 



T^r A.. i A. a 



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Val Ala Thr lie Thr A^a 'ys Gly Ala Arg Asn Vai Ala Leu Arg Asp 

Ser Ala Val Ala Ala Val Ala Ala Ala Ala Thr Gly Ser Gly Gly Thr
85 90 95

Ala Val Thr Thr Gly Thr Ala Gly Gly Leu Ala Arg Ala Cys Arg Arg
100 105 110

Gly Gly Thr Val Ala Ala Gly Ala Thr Gly Arg Arg Ala Gly Ser Ala
115 120 125

Met Ala Ala Arg Ala Ala Val Ala Ala Gly Leu Ile Thr Asp Ala Gly
130 135 140



u i c; " i p. Arg .-Ma */al 9rz 31 v Ala GIv \rg 11' 



lie Asp Pre Val Cys Pre Gly Glu Ala Gly Ala Ala Gly Tin 
155 l?o 



Ala Ala Met Ala Glu Gin Pro Gly Val Ala Al^ 
130 135 



a -a. .nr Aia Arg L nz 
19C 



^la Gly Ala Ala Asp Thr Ala Val Ala Ala 
: 0 0 2 0 5 



?rz Val ?r- 



j r\ 



la Civ Arc 



l-.r Ala 3.'.' Arq A*a Arc Leu Pro 



r-.rz Arg :,p'. 
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40 



Arg Ser Hi 5 His ?he Arg Ar 



g Ar] Asp Arg Arg Jly Arg lie Ser Arg 

3bC jh.h 



A^a His Leu Arg Thr Asn Ser Arg 
3-0 375 

(2) INFORMATION FOR SEQ ID NO: 198:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2852 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 198:
GGC OAAAACG CCCCGGGGAT ~^r^:r,r^^r^ 



j.-v^j.^^o^ -^l^.^jG CTACCATGCC 3GGGGTT0GG 2G 



j ^- ~ - - .-A'-orv^ ^ j'i'GGGCCGAG 

- - - • jGGGTTGACA 

-J ijv-t^ijoi i^j^o ^ vj ^. ~ ^ ^ -_r\ >— <-j \_ cro ^ v_ G C G 

:CACGC3GGT CTTCGGCAAC GTGGGCTTGG CGAACGTCCG OGAGGGCAAC 
1 * .~»ATG „. CC3 GAACTTCAA*" G ^ ^ r z,r; ^™ ^ ^ ^ - ^ -< - 



GAACGGCAAC 3 0 C 



jGTTTGGC 



-AACATC3GT 



hCI GGGG G 



\ C AG 2 T A C AA C 
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ttcggtggcc caccggtctt caatctcggc ctggcaaacc ggggcgtcgt gaacattctc ;:sc 

GGCAACGCCA ACATCGGCAA TTACAACATT CTCGGCAGCG GAAACGTCCG T jACTTCAAC 1320 

ATCCTTGGCA GCGGCAACC7 CGGCAGCCAA AACATCTTGG GCAGCGGCAA GGTCCCCAGC 13 BC 

TT C AAT AT CG GCAGTGGAAA CAT GGGAGTA rTCAATGTCG GTTCCGGAAG CCTGGGAAAC 1440 

TACAACATCG GATCCGGAAA CCTCGGGATC TACAACATCG GTTTTGGAAA CGTCGGCGAC 1500

TACAACGTCG GCTTCGGGAA CGCGGGCGAC TTCAACCAAG GCTTTGCCAA CACCGGCAAC 1560

AACAACATCG GGTTCGCCAA CACCGGCAAC AACAACATCG GCATCGGGCT GTCCGGCGA" " S^G 

AACCAGCACG GCTTCAATAT TG CTAGC2JGC 7GGAACTC3G GCAGCGGCAA CAGCGGCCTG 15 80 

7TCAATTCGG GCACCAATAA CGTTG 3CATC TTCAACGCGG GCACCGG AAA CGTCGGCATC 1 " 4 C 

.-^..AACTCGG GCACCGGGAA CTGGG 3TATC GGGAACCCGG GTACCGACAA TACCGGCATC 1 y C 'J 

TACAACACGG GCAGCTACAA CACCGGCGGC TTCAACGTC C GTAACACCAA CACCGGCAAC 1320 

■ . . _ - .-A . -rt' w IT . 23GCCGCAAC 2 16 . 
- - - - - - — . .ATC . 'AGCGGTCTC 22 2 

-^-^CGGCC .iACCGACGC' 1ACC CT'JCCC ATCAJCATTG TCGGTGCTCT CGAGAGCCGC 23 4 C 






G T AAG C GAA T AAACCGAATG GCGGCCTGTC AT 2 9 52 

(2) INFORMATION FOR SEQ ID NO: 199:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 943 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 199:

Gly Gln Asn Ala Pro Ala Ile Ala Ala Thr Glu Ala Ala Tyr Asp Gln
1 5 10 15

Met Trp Ala Gln Asp Val Ala Ala Met Phe Gly I s / r His Ala Gly Ala

20 25 30

Ser Ala Ala Val Ser Ala Leu Thr Pro Phe Gly Gln Ala Leu Pro Thr
35 40 45

Val Ala Gly Gly Gly Ala Leu Val Ser Ala Ala Ala Ala Gln Val Thr



Aig ?ne Arc Asn Leu Gly leu Ala Asn 7al Arg Glu G^v Asn 

55 '*0 75 80 

7a. Arc Asn TLy Asn 7al Arc Asn ?he .Asn Leu Gly 3er Ala Asn lie 

35 30 95 

jLy .~\sn G 1 y As:: 1..-* G ' y Ser j 1 y Asn lie G^ -' Ser Ser Asn lie Glv 



.a a _eu .^sr. asn _ - L\ 
12 5 



jc j - y . ni G _ . As:, /a^ G^v 
1S5 190 



ji- no:, jc: j^y Hsr; be: 
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— 5 330 :3 5 



240 



Gly Ser Tyr Asn Pro Gly Asn Ser Asn Thr Gly Gly Phe Asn Met Gly
245 250 255

Gln Tyr Asn Thr Gly Tyr Leu Asn Ser Gly Asn Tyr Asn Thr Gly Leu
260 265 270

Ala Asn Ser Gly Asn Val Asn Thr Gly Ala Phe Ile Thr Gly Asn Phe
275 280 285

Asn Asn Gly Phe Leu Trp Arg Gly Asp His Gln Gly Leu Ile Phe Gly

290 295 300

Ser Pro Gly Phe Phe Asn Ser Thr Ser Ala Pro Ser Ser Gly Phe Phe

305 310 315 320

Asn Ser Gly Ala Gly Ser Ala Ser Gly Phe Leu Asn Ser Gly Ala Asn

325 330 335

Asn Ser Gly Phe Phe Asn Ser Ser Ser Gly Ala Ile Gly Asn Ser Gly
340 345 350

Leu Ala Asn Ala Gly Val Leu Val Ser Gly Val Ile Asn Ser Gly Asn

355 360 365

Thr Val Ser Glv Leu Phe Asn Me: Ser „eu Va. Ala He Thr Thr Pro 
— 3-5 33: 

Ha L-eu Tie Se: Sly Ph~ Asn Thr Glv ^er Asn Met Ser Glv ?ne 

— : - " -;oc 

: "' r - c — 31v ? — Asr. SI v „eu A.i Asn Arc: Glv "a. 

,*sn __e ..eu Hy A sr. A^a Aar :H GH Asr Tvr Asn He L,eu Jlv 
423 42= 430 



. \ s n . a . 



'-' Asn Leu SI 



-'a. Pr,e Asn 'A 



v Asr 



Ser G 1 v Asn Leu SI/ 1H 
HE ■ K 



Phe 
4 * r : 
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31y Asn Asn Asn lie Giy He Giy Leu Ser Giy Asp Asn 31a Gin Glv 
530 535 540 



Phe Asn Ile Ala Ser Gly Trp Asn Ser Gly Thr Gly Asn Ser Gly

545 550 555 560

Phe Asn Ser Gly Thr Asn Asn Val Gly Ile Phe Asn Ala Gly Thr Gly

565 570 575

Asn Val Gly Ile Ala Asn Ser Gly Thr Gly Asn Trp Gly Ile Gly Asn

580 585 590

Pro Gly Thr Asp Asn Thr Gly Ile Leu Asn Ala Gly Ser Tyr Asn Thr
595 600 605

:iy nc Leu Asn A J. a ily Asp ^he Asn Thr 31 y Phe Tvr ; 5 r, Thr 3iv 

61C 513 520 

Ser Tyr Asn Thr Gly Gly Phe Asn Val Gly Asn Thr Asn Thr Gly Asn
625 630 635 640

Phe Asn Val Gly Asp Thr Asn Thr Gly Ser Tyr Asn Pro Gly Asn Thr
645 650 655

Asr. Thr Giy Pne Phe Asn Pr- Glv Asn Val Asn Thr Glv Al a Phe As- 
Thr Hy Asp ?h- Asn Asn Giy Phe Leu Val Ala Giy Asn Asn Gin Glv 



5^5 



530 5a = 



■ 5er '.'a. rhr Thr Pre Phe He? Prz :> A 3r 

- > „ ? - C 'J 

^ 3.:: Me:. Va ; X- Aso Val hi 3 Asn V.u :-1e- 3nr -he Lv r v Asr: 



v .e:. He Thr 7 a. Thr V. 



: Ala Ser Thr Vs; Pne Pro Gin "Thr ^he 



^ v " v Vi. Asn _eu 
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\A5 
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As:: Val Gly G 1 y 31 y Ser Jer 3 1 y Val Trp Aun Ser Gly Leu Ser Ser 

820 825 830

Ala Ile Gly Asn Ser Gly Phe Gln Asn Leu Gly Ser Leu Gln Ser Gly

835 840 845

Trp Ala Asn Leu Gly Asn Ser Val Ser Gly Phe Phe Asn Thr Ser Thr

850 855 860

Val Asn Leu Ser Thr Pro Ala Asn Val Ser Gly Leu Asn Asn Ile Gly

865 870 875 880

Thr Asn Leu Ser Gly Val Phe Arg Gly Pro Thr Gly Thr Ile Phe Asn

885 890 895



Ala Glv Leu Ala Asn Lev 



G 1 n Leu Asn Tie 3 1 y 3 e 1 r Ala Ser Cy s 



Ser Ala Phe 



Cys Gly Sc; Ala St;: A l; GIl £ 
930 935 

IT FORMATION 70?. 5SC I" NC ; 2 0 C : 



u - y be: - v a _ oe: u * u 
940 



SEQUENCE CHARACTERISTIC 



(A 
(3 



^z^i ^ . n : ^ j oase pairs 
Ti'PE . nuclei - acid 



,u-wo. . _ i near 



3E0L T E1JC_ L E3CE I PT I IL' : 



70RMATICN 7 CP 



. _ J ■ j - .. J _ T.J _ 



:hmat : 
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■;xi. SEQUENCE DESCRIPTION: SEC ID NG:2C2: 

3GATCCTGCA GGCTCGAAAC CACCGAGCGG T

(2) INFORMATION FOR SEQ ID NO: 203:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 203:

ITCTGAATTC AGCGCTGGAA ATCGTCGCGA T 

(2) INFORMATION FOR SEQ ID NO: 204:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 204:

- — ■< \ 1 j ^. o- _ IGAGri GAA GACCGATGCC GCT 

(2) INFORMATION FOR SEQ ID NO: 205:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3b base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 205:



:nof character: st : 

LENGTH: iC, base 
STRANDEDNESS 
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SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 37 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 207:

T CATGG AA TTCTCAGGCC GGTAAGGTCC GCTGCGG 37 

(2) INFORMATION FOR SEQ ID NO: 208:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: "67 6 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 208:

] CGTGACO 3CTACACTTS SCAGCSCSCT AGC3DCCGCT 2CTTTCGCT7 TCTTOCCTTC 12 T 

. ^^^^.o^- ACClTCGCC^ aCTTTISCCS 70AAGCTCTA AATCGGGGG2 TC2CTTTAGG ISC 

— GATTT ACTGCTTTAC SGCACSTIOA CCCCAAAAAA CTTGATTAGG STGATGCTTC 24 0 

^TAGTGGG CCATCGCCCT GATAGACCC7 TTTTCGCCCT TTGACGTTGG AGTCCACGTT 3 3C 

"TAnTAGT JGACTCTTGT ICC. AAA CTGC AACAACACTC AACCCTATCT -GGTCTATTC 160 



AAAT 0 A AA 



. - ~ - - - - - ^ ^ - *- vA^ \ .■ \ATAAGGCTA 



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TGGTGAGTAA GGATGCATCA TGAGGAGTAG GGATAAAATG GTTGATGGTC GGAAGAGGGA 114 C 

TAAATTCGGT GAGCCAGTTT AGTCTGACCA TCTGATCTGT AACATCATTG GCAACGCTAC 1200

GTTTGCGATG ITT GAG AAA C AACTCTGGCG CATGGGGCTT CGCATACAAT GGATAGATTG 1260

TCGCACCTGA TTGCCGGACA TTATCGCGAG CCCATTTATA CGGATATAAA TCAGCATCCA 1320

TGTTGGAATT TAATCGCGGC CTAGAGGAAG ACGTTTCCCG TTGAATATGG CTCATAACAC 1380

CCGTTGTATT ACTGTTTATG TAAGCAGACA GTTTTATTGT TCATGAGCAA AATCCGTTAA 1440

;:gtgagtttt cgttccactg agcgtgagac ccggtagaaa agatcaaagg ATCTTCTTGA 15 0? 

gatggttttt ttctgcgggt aatgtggtgc ' 7tg caaa c aa aaaaac gag g gctacgaggg 156 0 

" m GGT""'T"GT" T ' 7GCCGGATCA AGAGGTACCA ACT CTTTTT G GGAAGGTAAC TGGCTTCAGC 16 2 0 

AGAGGGGAGA TAG "AAA TAG TGTCGTCTA GTGTAGCGGT AGTTAGGCGA G GA GTT GAA G 163 C 

^AC.CTGTAG CACZGCCTAC ATAZCTCGZT" CTGGTAATGG TGTTAGGAGT GGGIGCI' GG J 174 0 

AGTGGZGATA AGT2GTGTCT TAGGGGG7TG GAGTCAAGAG G AT AG IT AC C GGATAAGGGG 130 0 

GAGGGGTGGG GGTGAACGGG GGGTTZG7GG A~ACAGG~GA GGTTZGAGGG AACGACCTA1 186: 

Av_G^f-uA^TGrt vjATACCTACA GCGTGAGCTA TGAGAAAGCG GCACjCTTCC CGAAGGGAGA 192 0 

-A.GGGGGACA GGTATCGGGT AAGGGGGAGG GTCGGAACAG GAGAGGGGAC GAGG-GAGCT7 ! 9R0 
OGAGGGGGAA AC3CCTGGTA TCTTTATAGT G2TGTCGGGT TTCG ZCACCT 



t\a 



jAGCCTAT 



U » ' J ^— O i. . 
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C7GAT GCCTCG3TGT AAGGGGGATT TCTGTTCATG GGGG T AA T G A TACGGATGAA 2^50 
AG3AGAGAGG ATGCTGAGGA TACGGGTTAC TGATGATGAA CATGCCCGGT TACTGGAACG 2 82 C 

T"**GTGAGGGT AAACAAG7GG GGGTATGGAT GCGGGGGGAC CAGAGAAAAA TCACTGAGGG 2 88C 

TCAATGCCAG CGCTTGGTTA ATACAGATGT AGGTGTTCCA CAGGGTAGCC AGCAGCATCC 2 94 0 
TGCGATGCAG ATCCGGAACA TAATGGTGCA GGGCGCZGAC TTCCGCGTTT CCAGACTTTA 
CGAAAGACGG AAACCGAAGA GCATTCATGT TGTTGGTCAG GTCGCAGACG TTTTGCAGCA 
3CAG~CuCT7 CACGTTGGCT CGCGTATCGG TGATTGATTC TGCTAACGAG TAAGGCAACC 
GGGCCAGCGT AGCGGGGTCG TCAACGACAG GAGCACGATC ATGCGCACCC 3TGGGGGCGC 



3QOO 
3060 

3 : : o 
3 ;ao 

L'GACGAA 3 2 4 0 

GGCTTGAGCG AGGGCGTGGA AGATTCC3AA TACGGCAAGC GACAGGCGGA 7CATZ3TCGC 3 300 

Z G7CGAGCGA AAGCGGTGC7 CGCCGAAAAT GAGGGAGAGC GCTGCGGGGA GG7GTGGTAC 336C 

GAG7TGGA7G A 7 .AAA G AA G A CAGTCATAAC TG GGGG 3 AC 3 ATAG7CA7GC CCCGCGC CCA 342 C 

GGGGAAGGAG C7GAC7GGG7 7GAAGGC7CT GAAGGGGATG GGTCGAGATG CCGGTGCCTA 34 3 C 

A7GAG7GAGC T AA C7 7 A CAT TAA7TGCG77 GGG G7GAG7 3 CCCGC77TGC AGTCGGGAAA 3540 

.• _ - - ^ j - '..."l-j. * . .-vn.^-A^r^ju ^^-.-w\Cw* _GCG 'GGGAGAGGGG GTTTGCGTAT 3631' 

7GGGGGGGAG }G^33TT™ 777*77 CA CCA JTGAGA 3 GGG GAACAGC7G;-. TTGC2GTTCA 5o6 3 

> .. ••■ -T^GAGAG7 "GGAGGAAGG :-G7'GCA_'GC7 JG777"GCCGr. GGGAGGCGAA 3^3 

---- . . . . -w^.GGCGGGA 2ATAACA7GA JCTG7C77GG J7AT3G7GG7 ; " 3 3 

.i Av-TAG GGA jATATGG GGACGAACGC 3GAGGGCGGA 2TCGGTAA7G JCGCGCAT7G 3 84 0 



v ^ i : ^-rt _ - „ ,rtrA - - j _ w j . rtrt'^. 



.GAT 
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GGGCGAGAGT GGAGGTGGCA ACGGGAATCA GCAACGACTG TTTGGGGGGG AGTTGTT3TG 
ZZAZZCZZTT GGGAATGTAA 7T GAGGTGGG CGATGGGGGC TTCGACTTTT 7GGG3GGTTT 
TGGCAGAAAG GTGGCZGOCZ TGGTTGACGA CGCGGGAAAC GG7GTGATAA GAGAGAGGGG 
CATACTCTCC GAGATGG7AT AAGGTTAGTO GTTTCACAT7 CAGCAGCGTG AATTGACTCT 
CTTCCGGGCG CTATCATGCC ATACGGCGAA AGGTTTTGCG CCATTCGATG GTGTCCGGGA 
TGTGGACGGT CTGGCTTATG GGAG7GGTGG ATTAGGAAGC AG GC GAG TAG 7AGG7TGAGG 
GGGTTGAGGA ZZGCZZCCGC AACGAATGGT GGATGGAAGG AGATGGGGCG GAACAGTGGG 
GGGGGGACGG GGGGTGCCAC GA T A C G GA G G GGGAAACAAG GGGTGATGAG G GGG AAG7GG 
:Z GGTGATGTG3 3GGATA7AGG GGGGAGGAAG GGGAGGT3TG 
-GA 7GCCGGCCAC GATGGGTCGG 3GG7AGAGGA TGGAGATGTG 3ATGGG3G3A 

:agg agtgagtata ggggaattgt gaggggataa gaattggggt gtagaaataa 
~taa g™aagaag gagatataga tatgggggat gatgatgatg atcacgtcat 
:atg zggaccazzz ggagatggtg ggaagagggg gggggggagg gggtggaggg 



gag; 



3GCGGGGGA7 AGGGTGGA7G AGATGGGGGT GGGTGGGGTG ATTGAGGAGG AGATGGGGGT 

jGAGAGGGGG gggaagatca gg-acgggat gaaggtggaa gtgtggttca AGATGAGGCG 



"GT 



GGG 



jHortU-jTAT? _3AA-. 



- . / -\ ^ w w- . 



gagggcacg: 



. iUu.wC "GGGATGGG" 



4440 

45CC 

4560 

4bGC 

4680 

4740 

48CC 

4860 

49:: 

49BC 
504 : 

5 : o o 

5 ISO 
52GC 
52 SO 

= i4 : 
;4C .■ 

34 b. 



--1U/1T GG -TGT 



tga: 



■GGG.- 



'GACGG A- 



sa:: 
ACTGGGGGAG GGGGAACTAG GCAATAGGTG TGGGAATTTC TTGTTGGGGG ACGGGGAAAG 6060

GATTGAGGGC GCZGCZGCZG GC7TCGCATC GAAAACCCGG GCGAACCAGG CGATTTCGAT 6120

GATCGAGGGG ZCCZZCCCZG ACGGCTAGGG GATCATGAAC TACGAGTAGG GGATGGTCAA 6180

CAACGGGGAA AAGGAGGCGG GCAGCGCGCA GACCTTGCAG GCATTTGTGC ACTGGGCGAT 6240

GACGGAGGGC AACAAGGCGT CGTTCCTCGA GGAGGTTCAT TTCGAGGCGC TGGGGCGGGC 6300

GGTGGTGAAG TTGTGTGACG CGTTGATCGG GACGATTTCC AGCGGTGAGA TGAAGAGGGA 6360

TGGGGGTACC GTCGGGGAGG AGGCAGGTAA TTTC3AGCGG ATGTCGGGGG AGGTGAAAAC 6420

GGAGATGGAG GAGGTGGAGT GGACGGGAGG TTGGGGTGCAG GGGGAGTGGC GCGGCGCGGZ 6480
"rGGGAGGGGG ZZZZ^ZZCZZ GGGTGGTG ~G ^^'Aaii ^Ci.GGGA_ATA AG GAG AA G CA " 

3GAAGT G3AG 3AGA~G~GGA GGAATATTGG TCAGG CGGGC GTGGAATAGT GGAGGGG 2GA -5r3GC 

G 3 A G G A G GAG CAGGAGG GGG TGTCCTCG2A AATGGGGTTT GTGGG GACAA GGGGGGGGTG 666C 

GGGGGGGTCG AC GG GTG -GAG ZGZZAZZZZZ AGCGG GGAGA CCTGTTGCGC GGCGACGACG 672C 

1GZZZZZZCZ AAGAGGGG3A ATGGGGAGGG GGGCGATCCC AAGGGAGGAC GTGGGCGGGC d^SC 

GGAGGGGAAC 3CACGGGGGG GACGTGTGAT TGCCGGAAAC GCACGGCAAG GTGTCGGGAT 584 C 

GGAGAAGGGG GTTGGAG GAT TGAGCTTG3G GGTGG GTGGT GGCTGGGTGG AGTCTGAGGG 6 90C 

GGGGGAGGGTC GAGTAGG GTT GAGGAGTG GT GAGGAAAA2G AGGGGGGAGG GGGGATTTGG -5960 

jCTTTA.GGGG .-iGGGGGjAAG G2AGG3A2TG Z^J\ZZZZZZZ GGGGG3TTGG GGTGGGAGAT "OH J 
GG GTGAGTTG TA.Ai.GGGGT AGGGGGGIAG GGGGATGAAC GAGGAAAGGG TGTCGCTTGA 

„~ j*^. - - --G/^m. T-GGGTGGTA TTAGGAAGTT AAGTTGAGGG .\TGGGAGTAA "2"- 

- • v_-.-\ ~ j .'GGG G G7A \ I' "GGG7GGG 2G iG "G 2GAAGG GAGGGGAC3G " 2 \ 

jj^-^jw^j^^^ .-irt-j _ . jCj GGGA-A7 G JAG' 2GGGGG7GGTG ^'7\1ZZ'Z'ZZZZ ZZZZZSjZ'ZZZ • 

:g;ja.:ga:.\ ::g-agag:g: -\ ; \; ga g v~: aggggg _tga gaatg'Gtgga jATATGGatg 
(2) INFORMATION FOR SEQ ID NO:209:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 802 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:209:

Met Gly His His His His His His Val Ile Asp Ile Ile Gly Thr Ser

Pro Thr Ser Trp Glu Gln Ala Ala Ala Glu Ala Val Gln Arg Ala Arg
20 25

Asp Ser Val Asp Asp Ile Arg Val Ala Arg Val Ile Glu Gln Asp Met
35 40 45

Ala Val Asp Ser Ala Gly Lys Ile Thr Tyr Arg Ile Lys Leu Glu Val
50 55 60

Ser Phe Lys Met Arg Pro Ala Gln Pro Arg Gly Ser Lys Pro Pro Ser
65 70 75 80



Ser Pro Glu Thr Gly Ala Gly Ala Gly Thr Val Ala Thr Thr Pro
85 90 95

Ser Ser Pro Val Thr Leu Ala Glu Thr Gly Ser Thr Leu Leu Tyr

10C "LZS ~ 1 " 



Val Thr 1 1 « Thr Ala Gin Glv Pnr Jlv ier llv Ala 
13 0 ; 3 5 — - 



Asp Thr Phe Leu Phe Thr Gln Tyr Leu Ser Lys Gln Asp Pro Glu Gly
245 250 255

Trp Gly Lys Ser Pro Gly Phe Gly Thr Thr Val Asp Phe Pro Ala Val
260 265 270

Pro Gly Ala Leu Gly Glu Asn Gly Asn Gly Gly Met Val Thr Gly Cys
275 280 285

Ala Glu Thr Pro Gly Cys Val Ala Tyr Ile Gly Ile Ser Phe Leu Asp
290 295 300

Gln Ala Ser Gln Arg Gly Leu Gly Glu Ala Gln Leu Gly Asn Ser Ser
305 310 315 320

31'-' Asn °he Leu Leu >o Asp V: -i "V n '-^^ ' 1 ** Gl - Vi ^ : ' 

325 330 335 

Gly Phe Ala Ser Lys Thr Pro Ala Asn Gln Ala Ile Ser Met Ile Asp
340 345 350



Gly Pro Ala Pro Asp Gly Tyr Pro Ile Ile Asn Tyr Glu Tyr Ala Ile
355 360 365

Val Asn Asn Arg Gln Lys Asp Ala Ala Thr Ala Gln Thr Leu Gln Ala
370 375 380

Phe Leu His Trp Ala Ile Thr Asp Gly Asn Lys Ala Ser Phe Leu Asp
385 390 395 400



Gln Val His Phe Gln Pro Leu Pro Pro Ala Val Val Lys Leu Ser Asp
405 410 415

Ala Leu Ile Ala Thr Ile Ser Ser Ala Glu Met Lys Thr Asp Ala Ala
425 430

Thr Leu Ala Gln Glu Ala Gly Asn Phe Glu Arg Ile Ser Gly Asp Leu
435 440 445

G 1 r. ".'ru Ar _• ^ -\ - » A 1 \ 3 . ' ' i \ - " ' V. . s A . i 'Aj . '.'a . Ar t 

■; '3 = 4 "' ." 4 " - 4 3 0 

Pne Gin G- Al a All A s n Ly n G * n lv" G 1 i 1 u Leu Asp Glu lie G e r 

4 ? 5 4 9 4 1* 5 
Ala Ser Pro Pro Ser Thr Ala Ala Ala Pro Pro Ala Pro Ala Thr Pro
530 535 540

Val Ala Pro Pro Pro Pro Ala Ala Ala Asn Thr Pro Asn Ala Gln Pro
545 550 555 560

Gly Asp Pro Asn Ala Ala Pro Pro Pro Ala Asp Pro Asn Ala Pro Pro
565 570 575

Pro Pro Val Ile Ala Pro Asn Ala Pro Gln Pro Val Arg Ile Asp Asn
580 585 590

Pro Val Gly Gly Phe Ser Phe Ala Leu Pro Ala Gly Trp Val Glu Ser
595 600 605

Asp Ala Ala His Phe Asp Tyr Gly Ser Ala Leu Leu Ser Lys Thr Thr
610 615 620

Gly Asp Pro Pro Phe Pro Gly Gln Pro Pro Pro Val Ala Asn Asp Thr
625 630 635 640

Arg Val Gly Arg Leu Asp Gln Lys Leu Tyr Ala Ser Ala Glu
645 650 655

Ala Thr Asp Ser Lys Ala Ala Ala Arg Leu Gly Ser Asp Met Gly Glu
660 665 670

Phe Tyr Met Pro Tyr Pro Gly Thr Arg Ile Asn Gln Glu Thr Val Ser
675 680 685

Leu Asp Ala Asn Gly Val Ser Gly Ser Ala Ser Tyr Tyr Glu Val Lys
690 695 700



-he Jer Asp D r; Ser Lyr> Pre As n Gl\ 



. i-- . ::: j*y -a. 

* 1 ^ 



3^y Ser ?r: A^a Ala Asn Ala Pr.: Asp Ala Sly ?r: ?u Gin Ai - 7m: 
^5 -30 * -35 



1 pr - A: J Ala [ r:. A.j Pre Ala Pro A. a ?r: Ala 
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 454 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:210:

GTGGCGGCGC TGCGGCCGGC CAGCAGAGCG ATGTGCATCC GTTCGCGAAC CTGATCGCGC 60

TCGACGATGA GCGCGCCGAA 2CCCGCCACC ACGAAGAACG TCAGGAAGCC GTCCAGCAGC 120

GCGGTCCGCG CGGTGACGAA GUTGACCCCG TCGCAGATCA GCAGCACCCC GGCGATGGCG 180

CCGACCAATG TCGACCGGCT GATCCGCCGC ACGATCCGCA CCACCAGCGC CACCAGGACC 240

ACACCCAGCA SGGCGCCGGT GAACCGCCAG CCGAATCCGT TGTGACCGAA GATGGCCTCC 300

CCGATCGCGA TCAGCTGCTT ACCGACCGGC GGGTGAACCA CCAGGCCGTA CCCGGGGTTG 360

TCTTCCACCC CATGGTTGTT CAGCACCTGC CAGGCCTGGC GGTGCGTAAT GCTTCTCGTC 420

3AAGATGGGG GTGCCGGCAT CCGTCACCGA GCCC 454

(2) INFORMATION FOR SEQ ID NO:211:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4^0 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:211:




CGTCTC^ SCOATGGGGG * Z 



Z 4 0 
30C 
3 60 
40C 

; ■: 




-OACCTGCTG 

wCTGGACAT 3CTGCCTAC0 SCC3GTGAAC SCAT 

rCGACTCGCT ICG CSC SCAT SCCCGGTCST TCACC 

:CC3CACCCA IGGCAACCCC AAGATCAT0 S ACGTCACGCC GGGGCGGCTG SAAACCGOCC 

"TGAGGAAGG SCGGGTCGTC ITGGTGCCCG GATTCCAAGG GGTCAGCCAG SACACCAAGG 

(2) INFORMATION FOR SEQ ID NO:212:
(i) SEQUENCE CHARACTERISTICS:

L ENG TH . I " ? 03^ 0 n a ; r ~ 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA
CCCAGCTC. CAAGGACGCG GAGAGCGATG AAGTCTTGGG CAAAATGAAG GTGTCTGCGC 240
TGCTTGAGGC CTTGCCAAAG GTGGGCAAGG TGGAGGCGC 279

(2) INFORMATION FOR SEQ ID NO:213:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 219 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:213:

ACACGGTCGA ACTCGACGAG CCCCTCGTGG AGGTGTCGAC CGACAAGGTC jACACCGAAA 60

"CCGTCGCGG GGGGGGGGTG TGCTGACCAA GATCATCOCC CAAGAAGATG ACACGGTCGA 120

^g-cggcggg GAGCTCTCTS TCATTGGCGA cgcccatgat gggggggagg zgggggtggg ;bc 

(2) INFORMATION FOR SEQ ID NO:214:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 342 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

MOLECULE TYPE: r^—,r- - -v.- 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:214:
-ATCGGCGCC GCGCCCGCCC "CAAGCECSC ACTC 



■'^G. — i^_^jj-v : *_ uCCGAAGGC 



-j^^G w uCA I -ATCG' 'O^C *^GCGG^"~*CAG " CAG G^GG^' • ' " 

CGAGGG EGCACCCTAC JTGACOCCGC ~GOTOCGAAA 3CTOGCGTCG 3AAAACAACA 1BC 

2GG:GGG CGCGGTOACC GGCACCGGAG TGGOTGG~CG ^ATCCGCAAA IAGGATOTGC 2 4, 

_G^GGC T G AA CAAAAG AAGCGGGCGA AAGOACCGGC GCCGGCCOC7 CAGGCCGCCG iC. 

3CCCGC CCCGAAAGCG CCGCCTGAAG ATCCOATOCC IC j4i 
CGGAAAACAA CATCGACGTC GCCGGGGTGA CCGGCACCGG AGTGGGTGGT CGCATGCGCA
AACAGGATGT GCTGGCCGCG GCTGAACAAA AGAAGCGGGC GAAAGCACCG GCGCCCTGAC
CG CTTCATCA CCCGGTTAAC CAGGTTGCCG CAGAAGCCGG CTTCGACC^C — -GCGGGT^ 
TTGGTCCGCT GCAGGGGGTC GGGGAGCCAG TTCAGGTTAG GCGGCCGAAA -C^CCAGT^ 
CGCCAGGAAG GGCACCCGGA ACAGGGTCCG CACCC 

(2) INFORMATION FOR SEQ ID NO:216:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 557 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:216:

TCGACC:CAA GGT3CAGATT CAACACGCCA TTGAGGAAGC ACAGCGCACE CACCAAGCG" 
TGA.TCAACA GGCGGCGCAA GTGATC3CTA ACCAGCGTCA ATTGGAGATG CGACTCAAC^ 
GACAGCTGGC GGACATCGAA AAGCTTCAGC TCAATGTGCG CCAAGCCCTG ACGCTGGCCG 1 
^CGGCCAC CGCCGCGGGA GACGCTGCCA AGGCCACCGA ATACAACAAG GCCGCCGA^ 2 
CGTTCGCAGC CCAGCTGGTG ACCGCCGAGC AGAGCGTCGA AGACCTCAAG ACGCTGCATG 
ACCAGGCGCT TAGCGCCGCA G CT CAGG C OA AGAAGGCCGT CGAACGAAAT GCGATGG^GC 
TGCAGCAGAA GATCGCCGAG CGAACCAAGC TGCTCAGCCA GCTCGAGCAG GCGAAGATGC 
AGGAGCAGGT CAGCGCATCG TTGCGGTCGA TGAGTGAGCT CCCCGCGCCA GGCAACACG" 
CGAGCCTCGA CGAGGTGC3C GACAAGATCG AGCGTCGCTA CGCCAACGCG ATCGGTTCGG 
CGAGAGT 

(2) INFORMATION FOR SEQ ID NO:217:

(i) SEQUENCE CHARACTERISTICS:
A LENGTH: ::■ :;ase :a;rj 
2 TYPE . r.u.ci~::: ; 

0 TOPOLOGY, linear 

: MOLECULE TYPE: :;:.*A 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:218:

AAGAAGTACA TCTGCCGGTC GATGTCGGCG AACCACGGCA GC3AACCGGC GCAGTAGCCG 60

ACCAGGACCA CCGCATAACG CCAGTCCCOG CGCACAAACA TACGCCACCC CGCGTATGCC 120 

AGGACTGGCA CCCCCAGCCA CCACATCGCG GGCGTGCCGA CCAGCATCTC GGCCTTGACG 180 

CACCACTGTG CGCCGCAGCC TGCAACGTCT TGCTGGTCGA TGGCGTACAG CACCGGCCGC 240 

AACGACATGG GCCAGGTCCA CGGTTTGGAT TCCCAAGGGT GGTAGTTGCC TGCGGAATTC 300

GTCAGGCCCG CGTGGAAGTG GAACGCTTTG GCGGTGTATT GCCAGAGCGA GCGCACGGCG 360

TCGGGCAGCG GAACAACCGA GTTGCGACCG ACCGCTTGAC CGACCGCATG CCGATCGATC 420

GCGGTCTCGG ACGCGAACCA CGGAGCGTAG GTGGCCAGAT AGACCGCGAA CGGGATCAAC 480

CCCAGCGCAT ACCCGCTGGG AAGCACGTCA CGCCGCACTG TTCGCAGCCA CGGTCTTTGC 540

ACTTGGTATG AACGTCGCGC CGCCACGTCA ACGCCAGC 573 

(2) INFORMATION FOR SEQ ID NO:219:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 484 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:219:



ACAACGATCG A. .GATATCG ATGAGAGACG GAGGAATCGT GGCCCTTCCC CAGTTGACCG 
ACGAGCAGCG 33CGGCCGCG TTGGAGAAGG CTG3TGCC3C ACGTCGAGCG CGAGCAGAGC 



GCTCAAGC 



3ATGAAG . CGGGCAAA ATGAAGGTGT CTGCGCTGCT TGAGGCCTT3 CCAAAGGTG 
P\AGGTCAA 3GCGCAGGAG ATCATGAC 



330TGGCCT3 3GTGAC3GT3 AGCGCAAGGC 3CTOCTGGAA AAGTTC3GCT 3333 
3C0CCGGCCG ACGATGCGGG 3CGGAAGGC3 TGTGGTGGGC 3TA303C333 ATAC 
; rtAc w ^ o w _ T 3 ACAGGGC CA • 3T CAC AA 



(2) INFORMATION FOR SEQ ID NO:220:

(i) SEQUENCE CHARACTERISTICS:
3ZNGTH . 13" ::ase :\i . • ; 

: CPA! ID ED NE 3 3 j ; : : « . - 



60 

12C 



A COT CAC OCA GOTCCTCAAG GACGCGGAGA 18C 



GG 240 

GGAAAT 7 SCO 33 CC AC 33C3C3GCCT jQO 

T Aj-\ 33 3 60 

o oCG A -l 2 0 

■ — _ il-j^ o'o j C 3 3 3 4 ti L' 
ACTTGGTACT GACCTCGCGC CGCCACGTCG AACGCCAGCG CCATCGCGCC GAAGAACAGC 480
ACGAAGTACA CGCCGGACGA CTTGGTGGCG CAAGCCAATC CCAAGCAGCA CCTGGGGC 538

(2) INFORMATION FOR SEQ ID NO:221:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 135 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:221:

Gly Gly Ala Ala Ala Gly Gln Gln Ser Asp Val His Pro Phe Ala Asn

5 15 

Leu I!- Ala Val Asp Asc Glu Arg Ala Glu Arg Arg .\sd Asd Glu Glu 

20 25 " .3 0* 

Arg Sir. Glu Ala Val 31:: Gin Arg Gly Pro .Arg Gly Asp Glu Ala Ass 

35 40 45 

Pic Val Ala Asp Gin Gin His Pr:3 Gly A~r Gly Ala Ago Gin Cvs Ar^ 

5- 55 6C 

Pro Ala .Asp Pro Pro His Asp Pro His His Gin Arg His Gin Asp His 

V — * 70 " 5 80 

Inr Gin Gin Gly Ala Gly Glu Pro Pro Ala Glu Ser Val Val Tn^ G 1 - 

35 90 95 

\3p Gly leu Pro Aso Arg Asp Gin leu Leu Thr Aso Arc: Arg Val Asn 

100 105 no 

Ala 7ai ~ lv '-'a- Val Phe Hu Pro Met: Val Val Gin Hi:; 

: 15 ::c :25 



(2) INFORMATION FOR SEQ ID NO:222:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15o amino acids
Pro

Ala 

Gly 
145 



7G -5 8C 

Ser Leu Gly Ala His Ala Arg Ser Phe Thr Gly Ser Gln Ala Gly

85 90 95

Ile Thr Thr Gly Thr His Gly Asn Ala Lys Ile Ile Asp Val Thr

100 105 110

Gly Arg Leu Gln Thr Ala Leu Glu Glu Gly Arg Val Val Leu Val

115 120 125

Gly Phe Gln Gly Val Ser Gln Asp Thr Lys Asp Val Thr Thr Leu



130 



135 



140 



Arg Gly Gly Ser Asp Thr Thr Ala Val Ala Met: 



150 



155 



(2) INFORMATION FOR SEQ ID NO:223:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 92 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:223:



Ala Tyr Pro Ala Gly Thr Asn Asn Asp Arg Leu Ile Ser Met Arg

^ 10 15 

' ' a - Ala Leu Pre Jin Leu Thr Asp Glu Gin Arg Ala 

2b 3C 

eu Glu Lys Ala Ala Ala Ala Arg Arg Ala Arg Ala Glu „eu 

5 4C 4b 



~> n 



vsi: nrq ..ei; Lvs Ar 



^eu 



. a i j e r A x c 



^formation :-c 
Gly Asp Ala His Asp Ala Gly Glu Ala Ala Val Pro Ala Pro Gln Lys

50 55 60

Val Ser Ala Gly Pro Thr Arg Ile

65 70

(2) INFORMATION FOR SEQ ID NO:225:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 113 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:225:

Ala Ala Asp Ile Gly Ala Ala Pro Ala Pro Lys Pro Ala Pro Lys Pro

1 5 10 15

Val Pro Glu Pro Ala Pro Thr Pro Lys Ala Glu Pro Ala Pro Ser Pro

20 25 30

Pro Ala Ala Gln Pro Ala Gly Ala Ala Glu Gly Ala Pro Tyr Val Thr

35 40 45

Pro Leu Val Arg Lys Leu Ala Ser Glu Asn Asn Ile Asp Leu Ala Gly

50 55 60

Val Thr Gly Thr Gly Val Gly Gly Arg Ile Arg Lys Gln Asp Val Leu
65 70 75 80

Ala Ala Ala Glu Gln Lys Lys Arg Ala Lys Ala Pro Ala Pro Ala Ala

85 90 95

Gln Ala Ala Ala Ala Pro Ala Pro Lys Ala Pro Pro Glu Asp Pro Met
100 105



(2) INFORMATION FOR SEQ ID NO:226:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 115 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



.a. ^er _e ^: A^a Asp 3lu Aoo Ala Thr Val ?r: Val 21 
,eu A* a Arg 11- o 1 y Val \.a A. a Aou Gly A 1 a A. a r : 



WO 99/42118

PCT/US99/03265



65 70 75 80

Glu Asn Asn Ile Asp Leu Ala Gly Val Thr Gly Thr Gly Val Gly Gly

85 90 95

Arg Ile Arg Lys Gln Asp Val Leu Ala Ala Ala Glu Gln Lys Lys Arg

100 105 110 

Ala Lys Ala Pro Ala Pro 
115 

(2) INFORMATION FOR SEQ ID NO:227:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 185 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:227:

Asp Pro Lys Val Gln Ile Gln Gln Ala Ile Glu Glu Ala Gln Arg Thr

1 5 10 15

His Gln Ala Leu Thr Gln Gln Ala Ala Gln Val Ile Gly Asn Gln Arg

20 25 30

Gln Leu Glu Met Arg Leu Asn Arg Gln Leu Ala Asp Ile Glu Lys Leu

35 40 45 

Gln Val Asn Val Arg Gln Ala Leu Thr Leu Ala Asp Gln Ala Thr Ala

50 55 60

Ala Gly Asp Ala Ala Lys Ala Thr Glu Tyr Asn Asn Ala Ala Glu Ala



30 



-ne .-^u A. a Gin ^eu Vai Tnr AH Glu Gin Ser Vai Glu Asn Leu Lvs 

S " ?5 

. hr _eu ,-t i s Asp G^n Ala Leu Jer Aia A- a A-a Gin A^a Lys Lyc Aia 

- a._ G~u Arg Asn Ha Met: 7a ^ Leu Gin G^:: Lvs Aj Ha G'u Arg 

-:' 3 ~ eu - e - Jer J — -eu 3Iu Glr. A.j Lys Met Ar. Siu J.r. Vai Ser 

ljJ H5 Me 

-i 3er Leu An jer Met Ser ;iu Leu A. a Aia Gr~ liv Asn Thr ^ro 



: 3 " : . 

(2) INFORMATION FOR SEQ ID NO:228:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 71 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single






(xi) SEQUENCE DESCRIPTION: SEQ ID NO:228:

Val Ser Thr Ser Thr Trp Val Pro His Pro Val Arg Asp Arg Val Ile

5 10 15

Gly Gln Arg Trp Thr Cys Ala Asp Arg Arg Ser Ile Glu Glu Ser Thr

20 25 30

Glu Met Ala Phe Ser Val Gln Met Pro Ala Leu Gly Glu Ser Val Thr

35 40 45

Glu Gly Thr Val Thr Arg Trp Leu Lys Gln Glu Gly Asp Thr Val Glu

50 55 60

Leu Asp Glu Pro Leu Val Glu
65 70

(2) INFORMATION FOR SEQ ID NO:229:

(i) SEQUENCE CHARACTERISTICS:

(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:229:

Glu Val His Leu Pro Val Asp Val Gly Glu Pro Arg Gln Pro Thr Gly

1 5 10 15

Ala Val Ala Asp Gln Asp His Arg Ile Thr Pro Val Pro Ala His Lys

20 25 30

His Thr Pro Pro Arg Val Cys Gln Asp Trp His Arg Gln Pro Pro His

35 40 45

Ar~ 31 y Arg Ala 3 In H_j ^eu Gly Leu Asp A; a Arg Leu Cys Ala 

5 0 - - n 0 

A -.a \ la Cys Asn Val Leu Leu '.'a. Asp 3 1 y '.'a .11 H^j Arg Pro Gin 

-;5 " ^ 3C 

Arg His Jly Pro- Gly Pro Arg Phe 31 y Ph« Pro Ai g Val Val Vai Ala 

35 90 05 

Ovs Gly lie Arg Sin Ala Arg '/a. 01 7a'. G 1 ; ; Arg Phe Gly Gly Val 

i : i ; o * ■ i o 

Pro l^o A. i . v '.'a V; y • '. : \r : As;; Asn Aro '.'a. A^.a 



A: : '. . -. : A: : 1 . v Veu \ \ ' A r ' : 

: ; ; 

'3 '.'a V 3. 3.:. r 1 - A:;;, A: t A , Aro Aso 3 1 ;i Pr " 

* 5 : i = f. I6i 

- ?r : Al.i J 1 • . L. • H:;; Vctl Thr ?r -. Hi.; Cys Ser Gin Pro 

'65 1 T "5 

>u H-s Leo Val 
1 3 :: 






(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:230:

Asn Asp Arg Leu Ile Ser Met Arg Asp Gly Gly Ile Val Ala Leu Pro

1 5 10 15

Gln Leu Thr Asp Glu Gln Arg Ala Ala Ala Leu Glu Lys Ala Ala Ala
20 25 30

Ala Arg Arg Ala Arg Ala Glu Leu Lys Asp Arg Leu Lys Arg Gly Gly
35 40 45

Thr Asn Leu Thr Gln Val Leu Lys Asp Ala Glu Ser Asp Glu Val Leu

50 55 60

Gly Lys Met Lys Val Ser Ala Leu Leu Glu Ala Leu Pro Lys Val Gly

-.■a _v/ S .^a S^n lie Met Thr Glu Leu Glu lie Ala Pro Hi u 

^ yC ?5 

?r: rt - d ^ d ?he Val Ala Ser Val Thr Val Ser Ala Arc: Pre Cvs 

ioo :05 ' 1:o ' 

a ^ „ P^v- *-j - y -fOTG KTg Cys G.'/ Pro Glu 

115 120 i25 

Gly Leu Trp Trp Ala Tyr Pro .Arg He Arg Glv Arc: Ser Glv Leu 



14 ~ 



14 G 

' Glv Thr Arc 



Pr^ Ala His Asn Ser Gly Arg Thr Pro Arc: Trr: 



* C5 0 



; 5G 

(2) INFORMATION FOR SEQ ID NO:231:

(i) SEQUENCE CHARACTERISTICS:
A ■ LENGTH . ; 7 3 amino -i c; : a s 
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein






115 120 125

Val Thr Pro His Cys Pro Gln Pro Arg Ser Leu His Leu Val Leu Thr
130 135 140

Ser Arg Arg His Val Glu Arg Gln Arg His Arg Ala Glu Glu Gln His
145 150 155 160

Glu Val His Ala Gly Pro Leu Gly Gly Ala Ser Gln Ser Gln Ala Ala
165 170 175

Pro Arg



(2) INFORMATION FOR SEQ ID NO:232:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: I": oase pairs 
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:232:

ATGCCAAGCC GGTGCTGATG CCCGAGCTCS GCGAATCGGT GACGGAGGGG ACCGTCAT~C 

STTGGCTGAA GAAGATCGGG GATTCGGTTC AGGTTGACGA GCCACTCGTG GAGGTG~CCA 

CCGACAAGGT GGACACCGAG ATCCCGTCCC CGGTGGCTGG GOTCTTGGTC AGTATCAGGG HI 

ICOACGAGGA CGCCACGGTG CCCGTCGGOC 3CCA(™GGC — GGA— OCT -r— 
AGATCOGCCC CGCGCCCGCC CCCAAGCCC~ ? ^ 

(2) INFORMATION FOR SEQ ID NO:233:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 39 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



J ^ ' /a - Ajc Ly:: 7a. A,-. Tn: G: : 7 1- ^- 

^' :jL ' Vj - • ^- 2er A..i Asp GI.; Asp A.« 






(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 107 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:234:

GAGG7AGCGG ATGGCCGGAG GAGCACCCCA GGACCGCGCC CGAACCGCGG GTGCCGGTCA 
TCGATATGTG GGCACCGTTC GTTCCGTCCG CCGAGGTCAT TGACGAT 

(2) INFORMATION FOR SEQ ID NO:235:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 333 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:235:



TGA 




CG CCTGAGTACT 3 C GAT AC TOO GTTGTGCAGC GGCGCTTGTC 



TACCCCGACG GCTCGTTTTG GCACCAGTGG ATGCAAACGT GGTTTACC3G CC CACAGTT" 

^TCAGCGG C.GTGAGCCC CTCCCCGGCC CGCCGCCACC GGGTGGt— C 

C-GT 3 GGGCAA 



(2) INFORMATION FOR SEQ ID NO:236:

(i) SEQUENCE CHARACTERISTICS:

■ M ; ^rCJG in: 1 ^ _ amino icilj 

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



MOLECULL 

3e;uhnce cms 

:-Vj', C.-:. ' eu Lys ?L~ Al i 

A_ i -Mi Ccu Val ?he ?;■ ; 



• — ^. .n..i ^eu cys a: -j 

* S*?r Yal C-r A. a A^r Pre Pro Aji 



SO 
L37 



50 



ICAG CGCAGATCCA CCTGACCCGC ATCAGC^GGA CA^GACGV " 
GGCTAT.GCC CGGGTGGCCG ATGGGGTTTT GGCGACTTGG CCGTGTGCGA CGGCGAGAAG 



iac 

24 0 

3 30 
33^ 






90 95 



? " 3ly 3ly - /S ^' ^ Ala He Pro Ser Glu Gin Pro Asn A.a 
ioo 105 

(2) INFORMATION FOR SEQ ID NO:237:



Pro 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 37?. base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:237:

GTGACCAOGG TGGGCCTGCC AOCAACOCGG GCAGCCGCAG — ,rr^, ~™ 

:;:r:r:r: ;^ GC3 - -,GGGTAACG GCACCGGCTC AGGCGGCAAG OGCGGCGCCG 
wi^.Ck, ^TGnTGGG AGCTTCGGCG OTA CCAGCGG CCCCGC— AT-GGG.— * 

;^:::;f aac ^cggcaagg ^c^c tcgcagcaac 

^:^? A ^CGGCAAA GGCGGCAACG GCGGTGCCGC CGGCAAOGGO GG—^CG 3 <n 

GA^C^G r GCA ^ GTC ™* ^GGGCCGC TGGCGCTGGC GGCGCCGGCG 



(2) INFORMATION FOR SEQ ID NO:238:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 424 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear




GTCATAGC 



SEQ ID NO:238:

~ ^TAGCOGGC 
TGGTCGGCCA TGTTGGCATG ATCGTGAGCC 
■AAGGTCTAGC TC CATGCGAA TCGC 200200 

: '23" crrnr,A ^aggggtcgg TCGGCAACAT 
:atcgg:™g "-:gagg:j: aogaogatcg 



(2) INFORMATION FOR SEQ ID NO:239:

(i) SEQUENCE CHARACTERISTICS:
len: 

' TYrr 



:NGTH ■ ; ^ base pa : r : 



13C 
240 



J 6 C 
371 






(xi) SEQUENCE DESCRIPTION: SEQ ID NO:239:

GCGATGGCGG CCGCGGCTAC CACCGCCAAI GTGGAACCCT TTCCCAACCC CAACGATCCT 60
TTGCATCTGG CGTCAATTGA CTTCAGCCCG GCCGATTTCG TCACCGAGGG CCACCGTCTA 120
GACCGGCTGC CTTTCGCCGA SCZGCZGGA* 180
ACCGTCACCG CCGACACGGT GCGCATCGAC 240
GCGGCGGCGT CCAAACTCAC CGAATCGCTG 300
CGGCTCTACG ATTCGTC 317

(2) INFORMATION FOR SEQ ID NO:240:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 422 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION:



SEQ ID NO:240:
- <-rL-iL\j^GC2 GGAGGCAATC GC1 



.GGCGTATGC GCTTCGCAGC CGGTGCCGC 

2GAGGAATGG TTCGATCACG ATCGCAGTGT GCCGTCGTGC ACCGACAC-G ™ C~AAC- 
7GAACTGAGG GCGG AAAATC GGC^GAAAT- — — ^Z"tzt — ^;UC. _ C 
_ _____ — ^—*C^ T.CACGCTCG GCGCCTAACG ifin 

j^-_*ooArtU . ^GGGTGCGC GC^'^rrr*^ - , ^ ^^^^ „ 

24G 



.GGAAG ACCTTGATGC CGAT 



GC..CTCGGC GAACGCGCGC GGGCCTTCCT TGGCGTCCTC 



.CAGTCGGTC TCGCGGATC5G IcC^X ™^ 3 °° 

GATGGCGTCG 3CAAGTTG""A. GA"— ^Z^r' - . .AGGCGA 360 

"GTGGGGCA CACGTGGCCG 4 2 "! 



— — J j 



INFCR.V 



(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 425 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



j _ .rtvrj : j -ATGGAGTC 
w 7GGJC77 jATCAGCCv" 



j - ^rvj-v. ^ j 

- GGG c s c s r 
^ - GATGATGT 
^ACG 2 GT AG 2 
jCGTGCTCCA 

-J*^; ^ r-v )AA T 2 

IS CA 2 AG 72 7 



CGCTAATG 2CCAGGTAGC 
jGGAGCTTGG 



- - JL.^v.J 



- v- ^ O _ O „ 

■"*-A 2 2- GG T 2T 7 
. . w\ ^ ' j CTG 7 






(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:242:

AGACCGGCGA GGG TGTGGTC GCTGCCCGCG GCATTGTCGA TAATCTGCGC TGGGTCGAC- 
CGCCGATCAA CTAGTGAGGC GCAACGCTAG GCTTTGGGAT a££SS£ SSS™ 
TCAAAGAAAC GAAGAAGGTT GCCATGAGCA CTGTTGCCGC CTACGCCgS £££££ 
CCGAACCCCT GACCAAGACG ACGATCACCC GTCGCGACCC GGGCCCGCAC SS^J 

TCGACATCAA attcgccgga atctgtcgct cggacatcca taccctcSJ S^gg* 

GG CAAC CGAA TTTACCTGTG GTCCCTG -CGAA.GGv, 
(2) INFORMATION FOR SEQ ID NO:243:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 125 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:243:

A5P 31 7 317 ?rC Ala 7hr As - ?ro Gly Ser Glv Ser Arg GV 

v \ , , 10 15' 

^ n - .ud 'ji / Ksn AaJ Thr Glv 



1 - d 



Asd Sly Ser Phc 



Asn Sly Ser 
30 






-s? 1./ Gly Lvs Ilv 51v Asn llv Glv ^ 

jq " ; 



D NO ZAA 






Met Ala Ala Ala Gly Thr Thr Ala Asn Val Glu Arg Phe Pro Asn Pro

A,r. Asp Pro Leu H,s - eu Ala 3er rle phe ser prQ A , a ^ 

^ 5 

Val Thr Glu Gly His Arg Leu Arg Ala Asp Ala Ile Leu Leu Arg Arg
T,r Asp Arg Leu Pro Phe Ala Glu Pro Pro Asp Trp Asp Leu Val Glu

Ser Gln Leu Arg Thr Thr Val Thr Ala Asp Thr Val Arg Ile Asp Val

90 

a Ser Lys Leu Tiir 



S 



0:3 70 75 

He Ala Asp Asp Met Arg Pro Glu Leu Ala Ala Al ^ 



3 5 9Q 
Ger Leu Arg Leu Tyr Asp Ser 
100 



95 



(2) INFORMATION FOR SEQ ID NO:245:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 41 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:245:



^ je. ^ - ^rg Va. Asn Ala Pro Glu Ala zit 

Leu ?ro Ara Asn Glv 5-- -u- . , , , " 

, c ■ ^* .a- .V5 Arg Ar. 

' r? — p ^ ?rc Ser Asn \*a 1 Ann ~ 

- = 4C 



I INFORMATION TOR 3EQ TC ;;c 

(i) SEQUENCE CHARACTERISTICS:

^ EN G TH : IS a rr. i n c a c . i : ; 






(A) LENGTH: 61 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:247:

Met Ser Thr Val Ala Ala Tyr Ala Ala Met Ser Ala Thr Glu Pro Leu

1 5 10 15

Thr Lys Thr Thr Ile Thr Arg Arg Asp Pro Gly Pro His Asp Met Ala

20 25 30

Ile Asp Ile Lys Phe Ala Gly Ile Cys Arg Ser Asp Ile His Thr Val

35 40 45

Gln Thr Glu Trp Gly Gln Pro Asn Leu Pro Val Val Pro
50 55 60

(2) INFORMATION FOR SEQ ID NO:248:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 213 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:248:

^ — _ ^^tiGi — r\_r.\o w vjn^ j j t jT CGCGGGTCGA TTCGTTCTCG 2 CGAAAGT CP 1 

— -At u-\GACC A C C TT G A C A C 1CAACCGGC0 3CCCGGCATC SGCCOTCOCG 2C0TAGAAGC 
. . . ja^-acj j ^ j/vf ^/\^ . . _ o CTGCT 2 IGCCPCPvTGC AGATCGCACA JGCTTGC7TC 



(2) INFORMATION FOR SEQ ID NO:249:

(i) SEQUENCE CHARACTERISTICS:
\ : EN GTH : i case oairs 







(2) INFORMATION FOR SEQ ID NO:250:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 420 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:250:

AACGCGTGAT TGGCAAGGCG ^CCZCSZAGC GGCCCGTAGC CGCGGGACGG GCCAGGCCCC 

GACCGCAGCG GCGGGTGTCT GACCGGGTCA GCGACCAGCG GCGCTGACCG TZCCZC^CZT 

OTACTTCGAC GCCAGCGCCT TCGTCAAACT -CTCACCACC GAGACAGGGA GCTCGCTGGC 

GTCCGCTCTA TGGGACGGCT GCGACGCGGC ATTGTCCAAC CGCCTGGCCT ACCCCGAACT ^4 0 

JGGCGCCGCA CTCCCTCCAA "GGGCCSCAA ""CACGACCTA Ar^GAA — — ^ — CC~CA 

2GCC3AGCGT GACTGGGAGG ACTTCT 3GGC OGCACCCGCC CAGTCGAACT G^CGCSACO 360 

-jTTGAACAGC ACGCCGGGCA CCTCGC 2CGA ACACATGCCT TACGCGGAGC CGACACCCTT 420 

(2) INFORMATION FOR SEQ ID NO:251:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 299 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:251:



GCCGGrG3 ~ .j2cggcggg. tcagcttcag sactgccggt zr^r.T^cz 

ZZZCZGGCG^ GGCOGG~GGG ITGTTCACIA 2CGGCGGTG7 2GGCGGCGCC 3G7GGGCAGG 

-CACACGGG IGGGGCGGG- IGCGCOGGCG -CCCCGCGG 2-TTGTTTGG7 3CCGGCGGGA 

TCGGCGGGGC GGGCGGATTC jGGGATCACG 2AACGCTCGG CACCGGCGGG ^CGGCGGG 

(2) INFORMATION FOR SEQ ID NO:252:



LENGTH i, ; 
(B) TYPE: amino acid
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION:



60 
20 
80 



2 4 J 

2 y y 






(2) INFORMATION FOR SEQ ID NO:253:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 121 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:253:

]lu Leu Glv Ala Civ Gly A 1 a -1^ ^ — 

- ^/ vjly A^a Gly 31 v Ala 31' 

/ , 13 15 

^ P ^ y ?r ° Gly Ala Thr Gly Thr Gly Gly His Gly GI : . 



10 15 
.- ^iy A. a Thr Gly Gly Thr GW r.tw 

'jiv 3] 



— 30 
^eu Ala Pro Gly 31y a:, 

4 ^ 



4C 

j*y fi-a Arc 3er .Asp Glv G 

5 5 



j-l^ a 

55 50 



Thr Gly Gly Thr Gly Cly Ala Gly Gly Ala Gl; 

3 j 

I" 3ly Gly Leu Gl', 



Thr leu Le^ ' ~ i - > - ^ ~ ^ 

-e^ vjiy ^_ a G^v Gl 

35 



90 

: 3 c 



105 







(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:255:

-si: . r a* Giv Glv lie GIv ■I 1 *' ^- ^k, f-i,, m 

- I ox * ,J *- Thr ^ly Gly Asn Ala 



GIv 

10 :5 



Met Leu Ala Gly Ala Ala Gly Ala Gly Gly Ala Gly Gly Phe Ser Phe

20 25 30

Ser Thr Ala Gly Gly Ala Gly Gly Ala Gly Gly Ala Gly Gly Leu Phe
35 40 45

Thr Thr Gly Gly Val Gly Gly Ala Gly Gly Gln Gly His Thr Gly Gly
50 55 60

Gly Gly Ala Gly Gly Ala Gly Gly Leu Phe Gly Ala Gly Gly Met
65 70 75 80

Gly Gly Ala Gly Gly Phe Gly Asp His Gly Thr Leu Gly Thr Gly Gly

A^a Glv Glv 



C 



(2) INFORMATION FOR SEQ ID NO:256:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 232 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:256:

Z^IZ^^" GTGGGCGGT'; TTGG GGGTGA OGGTGTGCCA TTCCTGGG 
^ooo^CCGGT GGTGCC3GCG GGC 



J - - -TGGGGCGAI GGCGGTGCG 

^CoGIGCCGG CGGGGCGGGC GGCAACGCE, 



o^iGG JCTGTTCACC GTCGGTGGGG 
•'-v.o - ^ j . j^v.jlhju3G TCCGGCGGGT 
■j - -GGGGTCGGG TCGACTA'^CG 



^ t 'j 
2 32 



- INFORMATION FOR SE 






ATGGTCCCAG CCCACTC3AC AGGAGGGGTG GCGAACATCG AGGTCAACAC GCGGT 4:5 
(2) INFORMATION FOR SEQ ID NO:258:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 373 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:258:

TGACCGCGTG AACGGTTGGT AACACTGATA GGTATGCTTG TCAGCGAGCA GATCAAGTCE 
^?™f" ACCA ATGCCA ^ GAG ^ICATCGGCT AGGCTCACGG TTTCGCCTGG GACGAGACGG 
. . GAGTTC TGGGC * . CCA EGGTCCETEC EJliiUTGGGA AGTCTGACGG EGGATGAGAA 

^:' T Z :3 ;™ l2 C ^ GT ~I; :GGGGGATA ~ ggcctatttg gtgtcgtggg gccgctccac .4 

EGGATEEC ..CGAACGi. GCGCAAGCGC GGTCCAGTTA CGGCCTGTTC ACTGCGCGC" 
GGCGTAGCTG GGCGGCETCG ATGGGTTTGA ACGTCATGGC .AATTGCCGCA ATGGGTGAGT 




(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 433 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION:



3GCCCAGTAC CCGCGGGCCG AACGGCAGC3 ACGAAAGCAA EGG CATC GAT ACGGGGATC" 

:r^!:;^ G - TGGGT f GAr aagettgcgg ecggactcga acccggctg; 



'3 r.VAT I IN' EIE ^ :: 



iagtag: 



_ .J 



SEQUENCE CHARACTER! 
A ICENGTH 4 3 4 base pairs 
r! IV PE : nuc * e ; : ic ; : 
3 JTRANE E 2 NEE EE ; ; r. ; - 



60 
I2C 

I 3 <: 



3 CO 
3 6 : 
3^3 



:30 
:4 3 






-3 SACCGCSC" 



SCAT CAT GGT cttcgatgac 



•csattt: 



:nfo»mat:~>; fjh 3H :c 



i SEQUENCE 2HAPACTEP. 1ST" ~~ 
• -.ENGTH . 4 2- ;;ase oa;r: 

E 3 TRANE EENES 3 : ~ : ng 1 e 
2 n ^ t -rrr 



120 
130 



')X;:™5 ;^ iCGGCCA atcc "-=ac ctcccggtac gtcagctgac ca—gc-ca 

:^f^ TCAG ^TGCCGC ACCGATTTCG GCGAACCGGG TATG^C^C 

rr;r:: GAC GTC3 ™*t ccsgcaggcc gggtgcggtc ggatcgtcx- cgc™-™ 

cSSgaS —I: ™L G ™- — ci? 240 

^ - - - ^^-Cjl ^atcgtgccc agcgcact:'" 

CACTAGCAGC GTGACCTCAC — Tr^,„ „^„„ ^ ^lal^. -.^Gw^CTC 300 

caaacctct aSc^ gaSaaS ScSccS? ccT^ GAAAGTGCGA 350 

x 4 04 

(2) INFORMATION FOR SEQ ID NO: 261:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 421 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 261:

. ^ „ ^~~^„„„ *I rT^-^rtrt^^v_^, ^.^.rv.AAGT TC3CACCCGG GTATCCGCC 

^.L^^LnnCL GG^GG^GT^l _ ,^ - v 

jTGCGACCAC TGAGCGOC- C ^GACCC GGCCGGTGCA 12 3 

-GCCTCTGC GCCCGGOCGr .Z^—- ^:*Hf? ^GGCAC CCGCCAATTG ; 8 C 



3a==ac3gca CGGccrrrr- tgcggccccc ~~Sgc~"~ -^--I! r:5 CAG::GAC 300 



420 

: 







(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 263:



SSSS SS^S C C C TGGCTAACT TCG — ~- 

GTGCGACCAC TGAGCC^S CC SCCTACTT GGCAAGACCC GGCCGGTGCA 

GTGGCTCTGC £~SSc" "GA^~^ GCACTATTCG ACAACGGCAC CCGCCAATTG 

GTGCACGTTG S^C- GCACCCCCCA GCATCATGGT CTTCGATGAC 

GACCACGGCA r^GC™ ±;;;::; = ^^^CG CAGCCGCGTT GACCAGCGAC 



— — — iwv.v.uL.Oi ^ 'JLHA 

~S SSS 5J:=;:; 

-^^TCCG ACG3CAAGCT GGTGCTGGG ."" AGCGCAGA— ^^C— 
AAGAACCCGC AGTTGACCGG CGTCGGCGCC GC^SSaG ^CTTGCC 



^ - ^format :c.\ t jce ce~ — 
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 739 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
: SEQUENCE CES CP 1 ?C1 CL T . SEC IE jfc^ 



60 
120 
180 
240 
300 
360 
420 
480 



* x 






GGAAGCCOTT AACCCGGCAC 

(2) INFORMATION FOR SEQ ID NO: 265:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 523 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 266:
ACTGCACCC3 GCAGGCGC3A — ? 

__ Tr .^l:::; — - — GCACTGCCOG TGGAGGCGCC .-n 

gggagagttc gacgaccg^ tcgacga^cg 3—™" ~™ — 

^coc™ ^££3 

-^^.^ — A.^ 3TCGGGTAGT GGCGCTTGGC GGTG^CGT^G — - — 
GGAGAGAAGA GCCATCGCGG TGTTCCG— A C ™r^~ ^nX^X^ — "'^^ 
— C^^G.A. ATCGGAGTGT CGGCCAATAT 36^ 

- CGCGCAGGCC 

GCTCGGCTCG GTCGCGAAGG TGGTCGTGAC 



. t ^^-vj ^ „ rt LU\iGCCG A 

TGTGGCCGCC GCCCAACTGC CGGGGTGGGA CGCGCAGGCC GTAACCCCGC GGGCACTG" 

C3AGCAACCG CAGGTCACTG AGCTGC" — - T rr- ~r.-^~~Z ouG ^- ACTG -^ 

jCTCGGCTCG r.-r~.r~.z^~. ^Z^Z -Ii:^" — " ^GCGGACC 



■2' INFORMATION FOR SEC ID NO: 2 5^- 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 114 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

• cecuencs cegc?.:pttc:;. sec - ;;c : u - 

;r ^-ggggta igagcgactt ccccggccgg cgc:ggcgc- ggagc 



SEQUENCE C" LARA C TE ? 1 1 " 
A LENGTH: 2 au- p ; . 

B T Y ? E : r.u:.e;i a c : J 

c strandedness ^; 



60 
69 



40 

3 0 0 



420 
48C 
522 






TGA ?:;1 C ;: cgggcctttgc rcrcGoccrrr gtcgaaacc, -gcaccgcgc : 2 : 

,U -^ -oTTGAGC CGCZAGCTAC GCGGCACGGG AATCCAGAGCT CGATC3GCGC ' SC 

-AGATGC3C3 3TGGTGATCG CGC3CC3CAG CAACGAGC7G 7AGAGCACG - *" 24 0 

-o^GCAATAG GTCGTGTTCG GCGATCAGCT CGCCGCTTCG AACCGCC— GC-GC— ^n- 

TGTCCGTCAG GCCGACATCG ACCCAGCCGG TGAACAGG7T GAGGGCATTC CAGTCGC~C~ iV 
CGCCGTGGCG CAGCAACACG AGGCTG C ZAG TGTTTGCCAT ACCGGCAAGT CTCTCACGCA 

: T :::^ r C ; £1"™™ ^caaaatg cccgaattct cctcggtccg ctgSgS ; 

u^TCATACC GCCGAGGTGG TCGGCACCGT AACGGCCGGT T 5^ 



(2) INFORMATION FOR SEQ ID NO: 269:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 426 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 269:
2TEEA3GCT0 ATTCGCTL'GA ACAAAGCCAC CGGGCCGTAC AGCGGACG"" — ^ Ar ^™ _ 

:;:^:;* G Iff::;*;: «»cc=™« cgaacctc^ cc^cca do 

— Irrrffl ;- GCACGAC — ^= "° 

* vj *" -^w.CTT CCGAC3AGGC GTGGATCGCC ^4C 
OGGCCGAGC: 



rCGAGCTGAG AGCGTAGCGO GT: 
rCGCTGGTCT TGTTCCC3CG UAGCACCTGC 



GTGAACTTGA CC3CGTCGAC ATC3GCGC3G 
3TC3T33C3r GC3GCAGGGG C3GCAACTGC 



seq :c 



iecue::c:: ":aracte.-::gt::c 

-ENGTH . jase oairc 

^ TYPE . e;c :icid 

Z 3 TH Aid S 0 N E S S single 
2; T0PCLC3Y: : ;.iear 



30C 
3 6 0 
42 0 
42 6 



MOLECULE 'YPE- 



:cj;a 







(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO



171 . 



/a i y G 1 y Asp GI '/ Va I Ala 



-e*j j^y Tn: 

20 

-ei: -he Se: 



60 
2C 
1G0 



420 
480 

54 0 



AAGATCATCC 3CGCCGCTCC TTAGCATCGC TGCGCTCTGC ATCGTCGCCG GCG CGGA""CA 
GCCTTGTACC CCACTCCTCG AACGGTCAGC ACCACAGTCG GGTTCTCGGG 
AXCCTTTTCG ACCTTGCCCC GCAGACGCTG GACATGCACG TTCACCAGCC TGGTATCGG r 
TGGGTGCCGG TAACCCCATA CCTGTTCGAG CAGCACATCA CGAGTAAACA CCTGGCGCGG 240 
CTTGCGCGCC AATGCGACCA ACAGGTCGAA TTCCAGCGGT GTCAACGAGA TCTGCTCACC 3 00 
GTTGCGAGTG ACCTTGTGCG CCGGTACGTC GATTTCTACG TCGGCGATGG ACAGCATCTC 3 60 
GGCGGGTTCG TCGTCGTTGC GGCGCAGCCG CGCCCGCACC CGCGCAACCA GCTCC~TGGG 
2 TTGAACGG C TTCATGATGT AGTCGTCGGC GCCCGACTCC AGACCCAGCA CCACATCCAC 
JGTGTC3GTC TTTGCGCTGA GCATCACGAT GGGAACACCG GAATCGGCGC GCAACACGGG 
2CACACGTCG ATGCCGTTCA TACCGGGGCA A 

(2) INFORMATION FOR SEQ ID NO: 272:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: ?3 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 272:
eu ?he Gly Ala Giv Glv 7a'. ■ 



21-, 



■i*a 



7a 1 






(2) INFORMATION FOR SEQ ID NO: 274:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 274:
Pro Asp Arg Pro Ala Ala Thr Val Gly Ser Cys Thr Thr Val Arg

Pro Cys 3er 3 In Pro .ai Thr Thr Ala 



.2) INF 0 RMAT 1 0 N 



OR 3 EQ II NO .2^5 



- ^ W u^C£ CHARACTERISTICS 
Ui. uENGTK : 2 0 amino acids 
^Bi TYPE: ammo acid 
X- STRANTEDNESS : single 
C; TOPOLOGY: lm«ar 



MC1ZC-E TYPE: ::rotem 



SEC^NCE DESCRIPTION: ^ TD NO:2~5. 



■■'e': Hi; 






50 55 60

Pro Gly Ala Asp Ser Ala Ala Pro Ala Ser Ile Met Val Phe Asp Asp
65 70 75 80

Met His Val Ala Pro Arg Val Ile Phe Leu Pro Gly Pro Ala Ala Ala
85 90 95

Leu Thr Ser Asp Asp His Gly Thr Ala Phe Leu Ala Ala Arg Gly Gly
100 105 110

Tyr Phe Val Ala Asp Leu Ser Ser Gly His Thr Ala Arg Val Asn Val
115 120 125

Ala Asp Ala Ala His Thr Asp Phe Thr Ala Ile Ala
130 135 140

(2) INFORMATION FOR SEQ ID NO: 277:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 142 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 277:



r Leu Asn Ala lie leu Arq Ala He Phe 31 v Ala 



3 10 15 

:;■/ Ser Glu Leu Asp Glu Leu Arq Arg Leu He Pre Pro Trp Val Thr 

20 25 30 

_eu Gly Ser Arg Leu A- a A_a Leu Pro Lys Pro Lys Arg Asp Tyr Gly 

2 5 4 C 4 b 

;ru leu Ser Pro Trp Sly Arg Leu Ala Glu Trp Arg Arg ]ln Tyr Asp 

5 j 5 5 ^ 

Air '.'a. lie Asp Glu Leu lie Glu AH Liu Arg .A _ a .^sr Pro Asn. D r:e 

Asp Ar^ Ihr Asp Val. Leu A^a „eu Met Leu .^rg Ser Thr Tyr Asp 

db 90 

\jo Sly Ser l^e Met Ser Arg Lys Asp lie Gly Asp LLu Leu Leu Thr 

IJQ ;C5 11C 



1 INFOPMATTLN F~P ~Z IT 

LENGTH- IS': hit.: r.u -t:uu:- 
TY P E : a rn i n c a ■:: : u 






Val Leu Val Ala Gly Cys Ser Ser Asn Pro Leu Ala Asn Phe Ala Pro
1 5 10 15

Gly Tyr Pro Pro Thr Ile Glu Pro Ala Gln Pro Ala Val Ser Pro Pro
20 25 30

Thr Ser Gln Asp Pro Ala Gly Ala Val Arg Pro Leu Ser Gly His Pro
35 40 45

Arg Ala Ala Leu Phe Asp Asn Gly Thr Arg Gln Leu Val Ala Leu Arg
50 55 60

Pro Gly Ala Asp Ser Ala Ala Pro Ala Ser Ile Met Val Phe Asp Asp
65 70 75 80

Val His Val Ala Pro Arg Val Ile Phe Leu Pro Gly Pro Ala Ala Ala
85 90 95

Leu Thr Ser Asp Asp His Gly Thr Ala Phe Leu Ala Ala Arg Gly Gly
100 105 110

Tyr Phe Val Ala Asp Leu Ser Ser Gly His Thr Ala Arg Val Asn Val
115 120 125

Ala Asp Ala Ala Thr Asp Phe Thr Ala Ile Ala Arg Arg Ser Asp
130 135 140

Lys Leu Val Leu Gly Ser Ala Asp Gly Ala Val Tyr Thr Leu Ala
145 150 155 160

Lys Asn Pro



(2) INFORMATION FOR SEQ ID NO: 279:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 240 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



Ihr Asn ;r: 



10 15 
-cu Val Aia Gly Ala Ala Ala Val Va* 






Lys Trp Lys Asn Cys Ala Gly Lys Thr Val Thr Val Thr Asn Lys Ala
165 170 175

Lys Thr Tyr Arg Trp Thr Phe Ala Asp Val Lys Gly Ser Pro Pro Thr
180 185 190

Ile Thr Val Ile Asp Thr Gln Glu Gly Ala Glu Gly Trp Glu Cys Gln
195 200 205

Arg Ala Met Ser Val Ala Asn Asn Val Val Val Asp Val Asn Ala Cys
210 215 220

Gly Tyr Gln Ile Thr Asn Gln Ala Gly Gln Ile Ala Ala Lys Ile Cys
225 230 235 240

(2) INFORMATION FOR SEQ ID NO: 280:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 280:

Asp Val Val Glu Ala Ala Ile Ala Arg Ala Glu Ala Val Asn Pro Ala
1 5 10 15

Leu Asn Ala Leu Ala Tyr
20

(2) INFORMATION FOR SEQ ID NO: 281:
(A) LENGTH: amino acids
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear






115 120 125

Trp Asp Ala Gln Ala Val Thr Arg Arg Ala Leu Gly Glu Gln Pro Gln
130 135 140

Val Thr Glu Leu Leu Pro Phe Gly Arg Pro Gln Leu Ala Gly Gly Pro
145 150 155 160

Leu Gly Ser Val Ala Lys Val Val Val Thr Ala Arg Ser Leu
165 170

(2) INFORMATION FOR SEQ ID NO: 282:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 61 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 282:

ii ^ G^y 'va^ Va^ G^y Val Gly Ala Thr' Ser ?r~ A! a Glv A" a G'v Ala 

■^y A^a 3iy Ser A^a Gly Thr Gly Ala Gly Ala Glv Glv Glv Ala ""hr 

ys G*y Arg I_e Asp Ser Ala 3ei Ala Le^ Ala Ala Pro Leu Ser Thr 

35 -tC 45 

lv Leu Leu Ala Val Pre Ser ~iu Thr Thr Asn 31:: Arg 



: : INFORMATION ECP 3 EC II 

(i) SEQUENCE CHARACTERISTICS:
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein



-lb 

■ a. ^eu Ivi ir.r .Ser L>?u Leu Ar-j Ar^ 






Phe Met Ala Trp Arg Arg Ser Tyr Asp Thr Pro Pro Pro Pro Ile Glu
115 120 125

Arg Gly Ser Gln Phe
130

(2) INFORMATION FOR SEQ ID NO: 284:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 63 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 284:



J re J I*/ Ser Phe Ala /Arc: "h-- ' 3>-- 



Arg 7:n Ala Asp Aid 



Arg cys Arg Asp Ser Arg Glv Thr Ala 



i n. 



H i s A r g Ala Leu 



Glu Pro Pro Pro Arg Gly Ser Glu Pro Ala Arg Arc: Ara Ser Arg 

7a 1 Arg Thr 7a ; Val His Asp Ser Leu Ala Ala Arg Arg Val 

5C 55 50 

(2) INFORMATION FOR SEQ ID NO: 285:

(i) SEQUENCE CHARACTERISTICS:
A' LENGTH: " I amine acids 
(B) TYPE: amino acid
(C) STRANDEDNESS: single






(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 285:

Asp His Arg Arg Ara Ser Leu Ala Ser Leu Arg Ser Ala Se - Ser Pr 
1 5 10 



15 

er 



Ala Arg He Thr Glu Val Arg Pro Cys Thr Pro Leu Leu Glu Arq s 

20 25 30 

Ala Pro Gin Ser Gly Ser Arg Asp Pro Phe .Arg Pro Trp Pro Ala Asd 



Aid G J v 
~ r, 



35 40 45 



His Ala Arg Ser Pro Ala Trp Tyr Arg Leu Gly Ala Gly Asn 
55 60 
Prj He Pro Val Arg Ala Ala Hi 3 Hi 3 Glu 



INFORMATION rOR SEC 



SEQUENCE CHARACTERISTICS : 
(A: LENGTH: : ~ 4 case pairs 
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

1, SEQUENCE DESCRIPTION: SEQ 1Z 



JCGCACGTAA OACCCTGAAT TGAAGGGAGC 



jG.CAT GGGCCGATTC 



^™;:? ;;" A ?^::: ^ gagg::::a c- ™ctgccac eaagtggtca ctcagcgcgt 

— ^.^o^jw^ .aCGAACGGCO SACAIACCA" ~~"~;a. " - ~ ^ 

A, ^ENGTH : i C 4 base pairs 



STRAND- 







(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 134 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 289:

Ala Asn Gly Val Thr Phe Arg Pro Val Ala Leu Glu Ser Leu Ser His
1 5 10 15

Phe Pro Val Thr Val Ala Ala His Arg Ser Thr Gly Glu Leu Thr Leu
20 25 30

Leu Ml Glu Val Leu Asp Gly Ala Leu Gly Thr Met Ala Pro Glu Ser 

35 40 45 

Leu Sly Arg Arg Val Leu Ala Val Leu Gin Arg Leu Val Ser Arq ^r;> 

5C ^ - c ^ ' ' 

Asp Arg Pru Leu Arg Asp Val Asp He Leu Leu Asd Glv Glu His Asc 

63 ^ ' 30' 

rrc .nr Pro G.y Leu Pre Asp Val Thr Thr Ser Ala Pre Ala Val 

93 55 



85 



Arg Phe Ala Glu I : e Ala Ala Ala Gin Pro Asp Ser Val 



100 



A_a 

105 110 
a. Ser Trp Ala Asp Gly G:r. Leu Thr Tyr Arc Glu Leu Asp Ala Leu 



v^a Asp Arg Leu Ala 
1 3 C 



1 INFORMATION ;1R 3EC IT NC:C?C- 

^cvUENCE CHARACTER 1 3T T CI 
i LENGTH : jLj case ua i : s 
i TYPE: ^c.e:: i::^ 
■ ' STRANDEDNESS . smuue 
: TOPOLOGY: linear 

MOLECULE TYPE . ::ONA 

SEQUENCE OEGCRirTI;r; "v* ■ 




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'.3' TYPE, nucleic acid 
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 291:
SScS JS^: ^fH5 CCAGACCCCA « 



ATCAGGCCGA TGCCCATGAT CACCGCACCr ^GACCCCA ACGATCATCG 

TAGACGAAC- CCt3Gf4rar C.CACCAGCA CCGCGGGCAT GCCGGTGGAA 

ccSSaS Sgc^acJc ac^a" GAAAGACGGC GCCGACAATG 

CGCGGGTAGG CGACGGC^ ^™ TGACGTGCAC ATCTCGCTCC 

TAGG7GATGA TCGCC3CG™ ^^IZ CCACCGGACC GGTCGCAAAA 

CC3CGATAGG .CAGGCCG.Ig ^GCG^G ^CC*** 
AGACCGTACT OCAC^CCr G^GACCTGA ^~e3C~G 



no : jy^ 



2 INFORMATION' FOR 3EQ 

: - SEQUENCE CHARACTER 1ST ""S 
■A LENGTH: 52o base pairs 
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
■X, SEQUENCE CESCRIPTION CEQ 10 NO : 291 
J ^CAATATGAG 



jyuuwiftC 2CGGCATGTA OGAGCTTGAG TTECCGGC 



j-jTCGC GTCCTTCG' 



- j^CAG 



^ ^11^"^" ' J3TaTTGG "F -CACGCTTTG 3AAGGT 

— . oGCCG CCGCCCA 

- - ^-^GATGAAC 

" " A • ~ — - _ E CAT OAT 



^AGGCGG C C 2TGGACACAG 
~ ■ >>'■.. w ^ j o -JuL^rt, TAA 



CAGCTAAGC C TO TAT' 



— j j ^ ^ » 



; ~„ ""^ 7G =~33CSGG 7™GGAGC'.o JA'J.TGAAGT 

GG -eke---- ; GGCC3AGC3 -^K^GTA -'GSCACAACC 

>CA ACCCGGAGCT ^-J^ ~~ ^ll ^ATCAC JATGACC3CT 

" " ' - • --—-^/v-^.. -..AT'"**" ■ 



3 ; ; ; : 






~ afY ~ 5rf TGGTG C^GGTTUAC GTATTGTTCC ACCGGCCCGG 

:;" CC : JI^^ t ^tcgcgg CGTCGA7GCA TCCGCCGACG acgacgtgcg 



aagcctcgcc tgccgccgca gccgcccaac tgtgtggcgg ^c^cS 
:;r C T ^; ^gggcacc accggagcct gcggccgtct ggcggaggcc agS£S£ 
-o^TCACG ccatacgcga cggtgcgccg ccgcttcgca gatttgcagg ctgcgt-gca 

CCAGA TCGAG CACCGGTCTC CCCAGGGACT GGGTTACCCC GTTGGCGCCG SS5£ 

Sc SS ATATCGGTGC CCACTCGACC CMCStxac TCCATAAG - aSSSS 6 00 



(2) INFORMATION FOR SEQ ID NO: 294:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 164 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 294:

Rr.~ Asc GIv "V^ *^ ■ -v-^ — - . - 

: ' * ~~ ^ C/3 Al3 G.y Ala Tyi 

Asp Asp Lys Ala Ivs Lvs Th- - v ^ _ , 

— - 7a. ^.a .eu Pr.e Ala 

Vj: Ala ?: y Vd: : ' ys - L - - : ^ 3i v a:, g: : , g:, ~L cv S As , 

-! F A ' a Ar3 A - ^ ?- h - Ph- J- Glr. Leu 

.-..a Va: r.u :'hr Veu As= 31;, Leu ?he 7a; Glu 

.. ~ 5 ' 3C 

..a. .^sc -.r:: ,y^. 

V'- — "vf: -he As:: Thr 

■ ^ - ^- ^ . ^ - 

*■** " j ^- ^ ^v*- -.m: h±x 

* :1J :nr Leu Le - Ai5ri A ~ a sf v , s ai, . /a: 

-r.r Yal ier 11- --^ . I*"' 

" ' -- r - Ji3 -^sp -Tvs T-/r Le^ 

^ :.»c 



240 
3CC 
360 
■*20 
480 
540 



610 



~ :-:^':rvAT:::; - 

S E ^ CENCE 1 1 LARA GTER Z 3 T 7 I S 
rt --"-71TH. I -J 7 3!".:r.c 

? TV PL: .ir^::;c acid 

■ "tv.a:;::e::neg~ 
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Arg Arg Arg Asp Leu Ala 31. • 31'.; Leu Arg Gin Cvs lie 

1 3 10 ' - 5 

T-.r lie lie Asp 31a Aia Asp Ala His Asp His Arg Thr Civ H's 31- 

His Arg „ly his Ala Civ 3iy n„ A. sp , lu Pro Pro sly Q , a „., g ^ 

3 f -10 4S 

■eu Gly Gly Lys Lyr Ag ? Gly Ala Asp Asn Ala Gin Glu His Ara 

, ° 55 60 

Gin Pro Thr His Pro Arg Gly Arg Arg Asp val His lie Ser Leu Pro 

65 75 

Arg val Gly Asp Gly Ser Gin Ala Thr Gly Gin His Pro His Arg 

95 



35 

Lys L 
50 



Gly Arg Lys lie Glv 
100 

Leu Thr Gin Arg Asp 
115 

- - - r 3 v As n A I .i u~ 1 * * 



9C 

sp Asp .Arg Arg Gly Gin Pro Asp Gin .Arg Lvc 

:ir Gly Ala Ala lie Gly Gin Gly Glu Gin Ala 
120 12 5 

:3 = 140 
G.u Gx- :.e.: Asn Thr Arg Arg Thr Cvs Asn Ser Gvr, -v- r- 



(2) INFORMATION FOR SEQ ID NO: 296:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 175 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

l Ar -7 Glu ~V "V ■ ■ d v- - ■ 

— - - .ir. Pre G.y Met TVr Glu Leu Glu 

> Ala Pre Gin Leu Ser Se- ^ . ... . ' J . . 
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= -50 155 16C 

: G.n Gin Pro Gly Ala lie Ser Asp ?he Gin Pro Phe Asp Leu 
165 170 

(2) INFORMATION FOR SEQ ID NO: 297:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 179 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 297:

Pro Vai Lys Glu Pro 7a 1 Pro Ala Lev: °ro Pro Va" p r - 

Aia Leu Pro Pro Leu Pro Pro Leu Pro Pro 7a'. Pro Glv ^"<- p*~~ 

^0 25 " 3 0'' 

7a 1 Pro Pro Pro Gly Ser Met Ala Pro Leu Phe Aru Pro o'-e ^e- 

35 4C 45" 

Ala Pro Pro Ser Pro Ala Leu Pro Pro Ser Pro Pro Leu Pro Pro 
5: S3 60 

7al Gly Vai Ala Ala Trp Leu Thr Tyr Cvs Ser Thr 



A_a 



Asp Pro Leu Ala 7a 1 Ser He Ala A^a Ser Met Aoo Pro Pro Th: 

35 JO ' 95 

TOr Sys Glu Ala Ser Pre Ala .Ma Ala Ala Ala Glo Leu Cvs Arc 

100 105 li: " " 

Ser Jvg Asp Leu A^u Pro A^a Asr Slu Met Met Glv Th*" ] - 

115 1 2 '2- 

_ys j.y .vro ^eu ;^y o.u A^a Ser A- a Jly Ser Aro Ser Aro L:: 

— I ^ 14 0 

Ster .:er Jer Jlv 7ai Pro Aro \sr ""-r ",1 ' r- t>*-~ * - ■ ., --^ 



-21 o a a e pa: 



"■'ANT-" 4 ;-" 
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A. .TCAACCC ANGCAGCTAC CACACGGGGA CTCGGAAACA CCGGCGATTT TACACCGGCS 
CCTTCATCTC CGGCAGCTAC AGCAACGGGT TTTGTGGAGT GGAAATTATO AGGGCTCATT 
GGNTG CAC C C GGSCTTRCGA ATCCCTCGKG CCAATTCAAC TCCTCNACAA GCTTGCGGCC 
GCACTCSAGC CCGGGTGAAT GATTGAGTTT AACC3CTNAN CAATAACTAG CATAACGCGT 
TKGGGCCTCT AAACGGGTCT TGAAGGGTTT TTTGCTGAAA GGANGAACTA TATCCGGA^A 
ACTGGCGTAN TACGAAAAGC :GCACCCATC GCCTTCCCAA CAGTTGCGCA ICXGAATGGC 
AATGGACCNC CCTKTTACCG GSCATTAACN CGGGGGTGTN GGKGTTACCC CCACGTNACC 
GCTACCTTGC CANNSSCCTN RSCCCGTCTT TC3TTTCTTC CTTCCTTCTC CCMCTTCGCC 
GGTTCCCNTC AGCTCTAAAT CGGGGNNCCC TTTMGGGTTC CAATTATTGC TTACNGSCCC 
CCACCCCAAA AAYTNATTNG GGTTAATGTC CCTTMTTGGG CNTCCCCCTA WTNANNGTTT 
TCCCCCTTNA CTTTGRSTCC CTTCYTTATW MTGAMNCTNT TTCCACYGGA AAAMNCTCCA 
CCNTTYSSGS TTTCCTTTGA WTTATMRGGR AATTSCAATY CCGCYTTKGG TTMAANTTAA 
TCNA ATTTTCCCGM TTTTMMNATR TTNSNCKCGM KNCTCCNRKA jGGin 



TSS GKTYCOCCRN G 



180 
24 C 
300 
3 6 0 
420 
480 
540 
GOO 
660 
720 
780 
840 

"COT 900 



(2) INFORMATION FOR SEQ ID NO: 299:

(A) LENGTH: 1382 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
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M> 1 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 300:



6C 



hA.TGGCACG AGTGACCGCG CTGAAGCCGG TAGCGCGGGT ZGCTCGGGZG GTTTGCGAAC 
RAAATCCGCT CGANGTGGTC TCGGTAGCCG GTGTCCANAA CGGTGGCGCG GTGCCGGCGG 1-0 
ATCTGATCGG CGCGGCCGTA GTGCACGTCG GCGGGCGTGT GCAGTCCG VT GCCGGAATG " 



80 
240 
300 



54 0 

600 
6 6 £ 



9 0 

3GC 



TTGTGTTCGT GGTTGTACCA GCCGAAGAAC CGGTCGCAGT GCACCCGGGC CGCCTCGATC 
GACTCGAACC GTTTCGGGAA ATCGGGCCGC TACTTGAAGG TCTYGAACTG GGCCTCAGAC 
AACGGGTTGT CTTGCTGGTG TGCGGGCGTG AGTGCGACTT GGTGACACCG AAGTCGGCCA 360 
NCANCAATGC CACCGGTTTG GAACTCATCC ACAACCCCCG TCCGCGTCMA GGTCACTTGT 420 
NCGGCGCTAA TTTNYTGGGC GGCAAGGGTT TGCCGAYCAN KCCGCTCGGC CAAAACTTCG 4 80 
ANTCNCSCCA AGGCCNCCAT CCNCCCAAAI AMGTTACGGG ANAAAANATY CAAAGAYCAC 
CYTCCGGKTN TTATANCTYC CCYTTTGSTY GGCCCCCCCN CYYTGKKNAT ACCCCTNC-A 
AWTCCCAACN CCCKCCAANA RCYXGGGGCC CCCNCCAACC CGGGKGAAJCA WTAATTTAAA 
CCCYAACMAW ACTWMMNACC CNNGGGSCCY AAMCGTYYNR AGGTTTTSCT NAAAGAAASA ^0 
-TCGGAAMC CGGNTSTACC AAAAASCCCK G-JWrr-CTC CRAGATTG3C .CCSAAWKSA 

— — ~sgc:iwmnc isgcggkkxt -jcgttnccct wmrcwmwyts gggcna^c-: 

-r^^SM^C .CCCTCCCCM CTCCGNKTCC 'J GAM C CYAN C MGGCCCCYTM GKKCCCWKN- 

"~ AMMNNNGGGG WGACCCTNGG GCCCMKRRGM TCCCNANTGA MCCTTWGNRA 960 
_:CCNRAR ANMCCSCNCC NGCNCRGKNN 

9 9 C 

(2) INFORMATION FOR SEQ ID NO: 301:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 223 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 301:

_ _ _ _ „„.,, r ,„ r , aUXJ " — - .-lA^ ^j^. wvjr^ . , - - IG IGGTG -J CGGTGGGGGT ;C 
^^^ JJ_J " "~ ^ j^ 1 - ov_ auu*-o^ _^ n ^ JCGGGGTGC" IT G G T CAT G G GGGCGCTGGC ' " " 
, ,^,,„ r ,,, ^ uouo ' .i.'A^viG.GL.n -^Lwx'jGTI^ GGATGGGGCG l^C 
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-Ao^CAGGC u j. -GGCCGCA GTCAACGCGC CCATCCAGGC CGTGACGGGG CGCCCCTGA- 
^^ACGLG ccaacggogc cccgggcaac ggggcccccg GCRGGCACGG GGGGTGGTTG 
^ GG^^rGCG GAAGG AA C GG OGGGTCCGGC GTCANCRGCG GGGCGGGCGG AAATGCCG 

(2) INFORMATION FOR SEQ ID NO: 303:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1049 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 303:

AATTCGGCAC GAGGGGCACG ATCGCATACA _ _ 

. . .„,„„„^„ ^, , ^ ^ „ " * . . I ACAGCA 

— — — \ .jl G A GOG C A DAATA "GG ""^ 



''^f^f ^^ GC3GGC: ACC3A3TC -- AGACGGTAAG CGTCATGGC 
3GGAGATCAC CCCCACCACG CCCTT IGGTT GATAGCACAC CGTGGTCr, 



3 00 
36: 
413 



-"■j-'j-C CGGCTTGARC ACCACCOCGT ' ' 

3 TAGTTCOACG 13, 
j GG^^IGGTT GATAGCACAC CG~Of;Tr™~ 

CACCAGCGG CTGTGCC^~" 



_ T _^_ G ^ A __ ^"" G ? : ^; ::A - c "GGTCCAC ACAGACTCGT GCSTTATAAT 30 ; 

:;;rr:;; G : agcaagtcca tgaagaactc gcggttctcg ato^ggt ogogatIgcg 

:::r;^ G ; ^ - CGCT ~ A ^-G /.CCTTCGCCA GTCGGTCTGC GC0GCOCOAN 43C 

;r^^ TCG AC -- GCG - ~ G ggaatcntat oa:gggttgc 5..- 

AACSANNCAA ACCTCGGCAA GGTTAGGMTT OCCCOG^ ^CA^AA^C S^5S ^ 

GXCNATGKTG mcaaggmttt cka^aakcg gggtcytcti; ntcngkggak 

^GGGMAC 0 SKNr^CCAAli CCTWACCCT'J KTKAANCGN* ttccccgcgg -3- 

y^"::!^ : ;:::^f; G '1 IM : ARA7TG ^' c ~gmctc —sggawtc ^] 

y>!ogsaa^tjg anmcnctc"^ ".*...*'* \\\^^!"lZ - -r. 

_ \~ ^ c ^ttnrt Ul A:A _ _ -CMA/\AT V AW~™O0C -6' 

:;i:t; G ^:; ;;. C ;;^;;; T • ■■---mvc: >!Mmsnc:-:sng :ixggn?.cc:jn ::: \ 
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/TTAT . . CMS 3CTNAYGGGA A73AMRGGAA CAAYNTCCC: 
GCTGGTNSYC CNCCCRCCNC AKAACCCRTT KCTG^STGT 
^^f NCSG AANTNSCCC - CCCSCXNNTT ATSTYCCCGK 
TCCCC3GTTA ACCCCCWIGrr SNCNCCCCC5 YTAAXMNCRG 
CNCCCGCTCX SAMCWNCCTJC CTCKAAOiAr r-vrw,o W 



~*iv,W* LIUH iV .iv.LLi 

CNCNCTCCSC MM KG SC 

(2) INFORMATION FOR SEQ ID NO: 305:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1036 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



G C GM GGAAAA 


ACCAACMSGC 


540 


CC3MAAA7NA 




600 


GT7CCCCEMC 


CECTTNAAMC 


560 


GC77S77NC7 


GCCCCY 7RMK 


720 


TNCGCAATirr 


WCMWCXCCMS 




WTATAAAACw 


WOJYAWYNNK 


840 


CXMCCCXCSW 


TWCYCKCSCC 


900 


MCNMBMKTCT 


YCSGKTWCWC 


960 


CCWCCCCCCY 


KKCTCTSKCC 


1020 






1036 



genomic ON A 



jGATGGC G' 



AGA7CA70A ATAGCGGGC" 3n~^ - — — ^.^^^^ 

^* ^--.-^ on^o.^^j -j^GA7CTC3C 6 0 

k Jv — "~ — m.G CAGG A GGTGGGCATC GA7G CGGA ^ ^ 

^^^^vj^^^,^ A G G 7 AAGG C G GACG : '^~:c "*r: -^.^^^-^^^^ 

:":::^rr: ^ ccaataac ~ --ac— G a ccaactccgg ™7cga7cg ^, 

--w^o^s-^GA GCGAGGG' 



..jATGGAAG 7AAGAAAC GG 



rSCCATGCGG 37 G3C CAAG"" 



GGCC0AAGG7 GAGGTGGTGT 3 0C 
ACGAC7GACC GAGCAAACGA "ihC 



^uuo.mAt: gancccagca \c~g^ac~a~ 



77G7G GGACA CGCCAGC 
AATGANC33G AGGGG GTGGG GGA; 



GGGR'JGCGYY NTOJAAAACGG CCGVrWAANCO 



- _ ^ C^ T G 0 A A G G 3 1 J AA :G J C G 



V^C^,\ C:0;r3 " CAj ^ rYAA ^ ECOGTYCMAAA ASK' 
.-vR AAA ATM ""7 ~AI-«A'" V NS" G " ™"\tm- • . 



J ^ O _ _ 



GGG:rc:JA 



53SG GCGMARCGGG IIGrCAAKXWWA 
.^.i..vji<.. w .rti\ .huuu "77377NANG 
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rffplrf^fr ^r:;;^:: ^ CCAA C " G AA7CCA7TCG GCGTCTCACCT GCAACCC~CG 12 0 

AAAC^C.A C^CGCATG GATCAACC~G GCGACCGCAG ATCCGAAATA GCC7-CACA~ < an 

1°™^ GCGCCGCA - ACGCAAGCTG 2 " 

^;;±1:° r;:-:^!5 ^ccgcg ccgccaagaa ATGACGGTGC GCATTACCAT 36 o 

TC~C~TGGAC A-r^A^ ™ CCTGCG ATGANCAGCC TTATGCCGAG 42 0 

CGGCAGCC 3CTTCAAAAA CTCCTTGTCG ACAATSGTAT T&^TCANCCG 4fl. 
jrawr NTRCTTGCAA SAACACTNCA TGTTNCSGGT NAACAACCYT G^SSS 
ACANCCAATA TTGAANTCCC ANTCGGGCAM GAACCNGTTM CGGAAGfCGK ^Ja*^ 

£^^C ^AAAANAN ACCCCGCAMS NGGGGGGAAA ATTTGNAA WT 

^~CN AWNCCCCMGK SARGGGGRttY rTMKAGGGMC 3S0 

^ ^I^ K ^sngrcaat ™-tttgk psssrnkctt 10,0 



(2) INFORMATION FOR SEQ ID NO: 307:

(i) SEQUENCE CHARACTERISTICS:
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 307:
-G GAGCTTCACG aaagagc 



- — v-tAG g tg a a 



,, ZII" - --GTGAACTGG. ATTAGGGTTG 

■ ^GAAAGATG ^GACGGTT AA 2TTGCGCTT&!Lac~GT~~A; 



rT~CG^G~A" "™^^~T~2 ' : ' tAr^^^STG ' :, ACATCATCTG GCACCGTATG 3 00 

^Z^, 2^ ' GGGGGGTAAC GAANAAGTTA TGAGA T CGG~ • 

-^o.uAmt: gggcatscnc jcggcamt™ "gcaac~-gc -v^-t^^ - ^ 

rGAATG CGGC GGTYAAAAGC :. T GGC' T ^G""'" ■"^"—vMn^ ""^ ^ - ^' rM ' r ^ u ^- 4 2 0 

^ * * 7TMMAAC GN AA C G C! JTI I G^JATYC* 



:cg::gnmntg cgttctctcc ^^^^ g -4bo 

' ™T G G A :; TT A M P T7 ;™ " .AA.-. A AM C : 



--j-r.^- o o'v. , i'GAAAGGCMA CTNCCCGC: 



_ t " - -^^rt.GTN'GGP TGAAAiITAMM 

« . - r _ :g ; G Y ~7 G :G J G! : GG G ; <*m;T< ■ : — vn^rp ^ « ^ 

rGR*r^:r" w ir; "* " 

— V4 -mtt -y — ~-g:;uag gwcgggggkk 



tgmsa prgg:;gyggt- ■-■ ~~~~ % \ ,1T.,M^1 ^ r ' ' ' T " — '~- j ^cgg:<:c-iaa 

v*.'^rrrr/^" - :t,a - ;gg3 " :: y-a-paaacaa agggc^vra 

^^^; : T.;; T - ::s -' : ^^ ^kgwant^ct sgmsccmns?; 

•;aWNG'" ^^''^ ^!^'^' GG ''' ^ G;;C ^- -AAAGGG3MS GJCKG^NGN 

^r.,., u .^r,... ..,.:.G:-gmkp:; ^jg:w>;gn^; 



- - ■ -j i .j 



g; a g: j nm km ga w s m g 
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fD) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 308:



AATTCGGCAC GAGACAANGG CGTGAAATGG GATCCGGCCG AGCTGGGGCC CGTCGTCAGC 

sss s= ss SSS SPSS 

2££S ™S — - ~ ~ -~ 



60 
120 

iao 

ANTCC-GATA r-TKr—ara """7;:^ ULWIOTTC CAGGACTGGT 240 

U ^ ibAiA CTTKGvj J.ACA TCGTGACCAA CTGTGGNCAA TATTCGGCrr ^r^r— ~w 
NGTCGCGTCC CGCGCGGTAA GG— Mtfrar TATTC^GCGC ^C.CCTCGTC 3 00 

^ •^---ANCAC TTCCTTTTTC TCGTGCCG 34a 

(2) INFORMATION FOR SEQ ID NO: 309:

(i) SEQUENCE CHARACTERISTICS:
:AJ LENGTH: -32 oase oairs 
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 309:

AATTCGGCAC GACAGACCGG G7GG7TGAC" ^\CGCAC ~~ 

^ ~.rt\j^^ CTTCTCGTTG ""^^GCG ' , T ,_£„ rr , „ ^.... - ^>v, ^ . 

=:= C3GA7G " ---tgtaccg cacaccaggg aggtccttca : 9- 

TGTACGCCGT GACG~CGAAC — AC ^— G ~j^. wV _^ . CGCCGGGAA 24: 
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.-.WCYTTKN NA..KGGNNA AAAANCCCTY rCWCSCRACT MCCZCCOIGM GRGMCNNTNN 
NTT7YGNCNN CCCGGSNAAM RNTTKATTTC NGGGC3HTCN GGGTKMNNNA AACCCCAAAK 
MNRNNKC5CA AMGGGKSNGC NKNNWWSGT TTTYCKNMRA MRNWTYKNKN MTCMGARSHN 

maamotosmk ngkkknnkaa arnnttwktm knscnnhcnn grrngvrggc ckmkgsnmng 

MCWHNAWRNG NNGSHCNCKC NNKJOJAAAAA AASGGTOCXS NSKKNKKKKG TOGGGGGGGG 

(2) INFORMATION FOR SEQ ID NO: 311:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 323 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 311:

AATTCGGCAC RAGAAGACGC CCGAANGTTT GCGOTGCCrG TAGAAGTTCA TCAARGCGC^ ^ 

GGCOGAAOGO AACTTCGGCA AGATCTACGT TOGCTTCGOG GAAGOGGTCT GGATG^G^ 

GTAGGTGGGC GO AOGGCA0G GGGAG0TGA0 00AGGATCGG GCOGOGAAAG GGCTTGGGTT 18 

^^^^^ TC^TTCGAGG TGGCCTGGAG GA7TTTGCAN GCGACGCCNG TGAGGGCGAC 2 4 



Jj4 case ca: 



TOPOLOGY ; : : 



^20 
78C 
84C 
900 
960 
9€2 



_ _ V ^_ T ____ -O,. .^WCT GGACCAGGTG 300 

2' 1 INFORMATION FOR GEO 







SCNSNGGK3C CSCC 

(2) INFORMATION FOR SEQ ID NO: 313:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 331 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 313:

AATTCGGCAC GAGCCCACAT GGGGGGCZGC TCGTTGCATG ACTCGTTCGT CATCGTCGAC 
RAGGCACAGT CGCTGGAGCG OAATGTGTTG CTGACCGTGC TGTCCCGGTT GGGGACCGGT 
TCCCGGGTGG TGTTGACCCA CGACATCCCC CAGCGCGACA kCGTGCGGG^ ^GGCCGCCAC 



info-madid fop sog id >;: 

SEQUENCE CHARACTERISTICS: 
A) LENGTH: 122 b case pairs 

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA
: SEQUENCE DESCRIPTION. SEQ 1 



. G A FOGG Z Z AGAAG AT G C 



_ CA.GGTGCT GCOCATCCCO OCCACCCCA 

™^!!*A = 3C3GTC — ~ A — adggcga dragctogto ttoggcat 

70AC0CCGCG TCCGAAATGC AGCAGCTGGT CAACGGTA 
30TGGCGCT0 ANCGACAATT OOGTGCTGCT GTTTACAAGG ATCCGC_ 

;^ CGCCG ~ Grr ^ 5GGG ^ cggccdtctg tocogac 



jva/\CTGGGTG 



, . 0 0 0 3 MM AA A T C C C 0/0 ; M 0 AAA 0 00 00 



J - - jjw^ jOOGGGNGGO 00 0 ON AWN 02 

™^ll3" T TT3K - ;c?:r: ^.'ojggonco ncnaanscan iootttfogo 

- ^ -^rt'TT . .AA 2 0GAH0G3ON AAY3 0 0AAGY IMMGKOCYCY r 0 , T AAAAAAA A 

■-COCAANTAA ATTOOONGGO COYTTOGGGG 2GRANCNYNT TTTMC03NS0 

:;::naa*c ncgancodgo i2aaytmmtxg naayo^ccs:; aat-ibnti-itc taannccccn 

'tGGVT"G:" SGFFGRAGGM Av\AAAANP 3 *« 

- ■ " r\s\~*> A ..j , _y .j- j: j j . : 



.igaaa attnnamaam dmnnktgsn 
:j:o; sanmncnsnn sggngnnnn 



12 0 

130 



" " — j^rtAGC.'OA AAGGTCATCC GTTGTTCGCO CACATCACCT Z'iO 

\G TGAGCGCTCG ICGATCGCCG CGOTGOTOAO GAGATGCTCG ANG AG AT OA 0 3 0G 
3*0 TGAGTGCGOO TOCOOCGAGO A 



2GA0TA 13: 
jCGTOT 24 2 

:CATOO 3CC 
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(A) LENGTH . 3 24 base pairs 

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 315:

GAGAAGACGC CCGARNGTST GCGCTGGCTC TACAACTT CA TCAARGCGCA 
^1°J? CGC ^"CGGCA AGATCTACGT TCGCTTCCCC GAAGCGGTCT CGATGcSS 
-ACCTCGGC GCACCGCACG GCGAGCTGAC CCAGGATCCG GCCGCGAAAC 0G CTTGCGT^ 
: C - TTC3AGG TGGCCTGGAN GATTTTGCAN GCGACGCCNG TNACCGCGAC 
^...Ku^ — -TCACCAC ECGCSGCACC GCGTTGACGC TCGACCAGCT 



informal—; for gec :l :;c:3^ 

: 5 EQUEN CD J KARA CTER IGTIC3 

;aj length: ;j:o 3ase oa , r3 



(3/ TY?r 



acid 



;D> TOPOLOGY: linear 
I) MOLECULE r v °- - G* ar """ 1 - 
i : SEQUENCE 3 OR I ?T i ox ■ ~- 



^0 GANGCGTGCE GG^NAA 
-»^- — juTGGC 3TG 



>. -jo ECCNAAAT' 
jGT 3ACCCNAA; 



.:>'or\ . ^Tvii jAC 



^CCEGCGGC TGCCAGATAT COCGGACTCG 

. _ - GACGGGGCG OGGCGACCAC AAGGTCGCTM 

"IT^^I^H ^^r?lIl"^Z - - - -AT LAG Y GAG GAGGCGT AGGACAAGTC GATCGAATGU 

" „ _.\^-o^rtAA 

- - - - K "" ; ' ■ — j _ jo j t \ 2GAAT 3GGAACCGC, 

- -''--^ JCG 3ACAGTG 3TCNACANC 

- • - ■ ...jACGU 'GGCCCACNAT AANAACGGGG ACNACAATCG 
CANCTTG5C ATCGGATTTT 3TCCCCAN0G Z^OAANCCGT 
::;^; T :|MAWTAACT0 OOGCTTCCGH ZCCTGGNGCA 
_.i^...Grtrt jGGGTTG.,0 N'ATTTTTACT jGTAArrrco 

-L'l^"^' ' " •■ y*> j NGG 177 AAAAA jGG3KNNIC7~ 

I L .. ' ; ' A AAA'/.' AC OMMMMYGN "-'-.^'v^ <- 

,7^*;3 ^,„. ^ ^" 

^'..^-^.rjj.i,, NuAAGMGGGE 



'"ttgcnnaa wm;<;- 



6C 
12 0 

:ao 

30 C 
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(XI) SEQUENCE DESCRIPTION: SE r 



MC : 



AATTCGGCAC 
GTAGTGCCGC 
ATGCCCAGGT 
ACCGGGAGCT 
ATAGTGGCCT 
CTACTCCGCG 
3AACGGGTCT 
AT AN AT CT GG 
ZCGCCCCGGT 
CCCGAACGCC 
WTAAATGGGA 




GANGCGTGCC 
CGGTGGCGTC 
AGCGGCC GAG 
TGG CAT CGGG 
CCAGAGTGCC 
TANTGTTCCC 

ganctcaggt 
cccnaaatcg 
cacccnaaca 

TCNTCCGGCG 
AACCCTTNCC 
GANTCGGTCN 
S YTOAAGGGG 

NMMRAACTKN 
WMKXGKNWNM 



3CTNAACACC 
GTTGCTCTCC 
GTGCATGGAG 
CCTGATCAGC 
CGTGCAMTTC 
GCATCGCCTG 
TTGCCGCTTT 
GCGCCGACGG 
ACANCTTGSC 
NACTTTTCTT 
YCAC CTTGAA 
KCCGGGSTTT 
GAAACCCAAC 
' -AMGGSGGTN 
. GTICTCNNCG 
GG GG CCS SG A 
GNMNSCSMGG 



AGC ddCjOCjCjC 


TGCCAGATAT 


CCCGGACTCG 


60 


TGACGGGGCG 


CGGCGACCAT 


AAGGTCGCTN 


120 


TCGATGATGA 


TGCGACTCTC 


CAGCTCGCCG 


100 


CAGGACGCGT 


AGGACAAGTC 


GATCGAATGC 


240 


CNGCGTGCTC 


CACGGCAAAT 


GCCTTGAm 


' 3 00 


CGGGATGAAT 


GGGAACCGCA 


SGATGGCGAC 


360 


GCGCACAGTG 


GTCNACANCC 


GGTACTCGGC 


420 


CGCCCACNAT 


AANAACGGGC 


ACNACAATCG 


4 80 


ATCGGATTTT 


GTCCCCANCG 


CTCAANCCGT 


540 


NNAWTAAC TG 


CCGCTTCCGK 


TCCTGGNGCA 


600 


ou GG T"TG TT G 


NA IT" it; ACT 


GSTAACCCCG 


660 


YSTNTTCCCC 


ACCTTNGNAN 




720 




AACCSCMNAA 


! 4YMTTTY C S G 


7 30 


AACCGXTMNG 


NGGKTAAAAA 


GGGSKNNXTG 


6 ■! 0 


^ ^> C CXAAA W 


ACCMMMMYGN 


TTGKKXNKSS 


900 


* « *TTN AAA G 


mscccccsnn 


::-stgkcccnn 


r*6 0 




NNAAGMGGGG 




1 0 I G 



o v 



NC 



(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10?: base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE:
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, 3EC T GENCE CHARACTER Z 3~I CS . 
(A) LENGTH: 1251 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:319:

GGGGGGGNNN MATACATCWT CYGTGYACCG GGG MTCTAKT GGCGGGZCGC 
ASAGATCTCT N AMTTC G GG C ACAAAAACTW GACAAASYMT CGNGCNMTC" 
TCGCAAAACG NGTRACASAC ASACACRTAT GTGTGCCCAC CASCAA^CK 
GCTRACCGGY TGCCCRNACG CCACCYTGC3 CWTCTATCCC RAC3CCGGCC 
ATA^CCAGG CACCACGCCC AGTTTGGTGG ACAATGCOGT GGCAKTTTCC 
T3AAACCGAA ^CNSMTTGA ACCNC CAARG ZZZZSXCZim AACARTTGGG 
jlf™!"™? : * 7rT - CC3GGG 3TNTCGGCAN AANCGCAC CC WTGGWTTCTM 
.j^jm. ^. m * " ' - .-.A..... - - -'-A -i: ^auuGCTG GGAC^CCSCA 
J ~^~"™""' VCRAAMAc:: - 3GAKCCGCAA TTTCCGGGCR ANAAAT""TC?J 
3C . . RTAC7*T ~CCCGACCGT AACMAN TT T C ATCGTCNTNN CGTCTGCCCT 
Z KAAA Y A C G G CM. . KGGTTT CGCAACCTGC GGCCGAANTG CCNAvr— 



RAN AA CCS 3 3 NTGGCCNNYT 



.jvj.noS AAAA 



:j " scc::::: A 2Z AAMACCC ' :ACCN ^^ cawtctttgc gaaasttkgg 

;!?!!TIr^* ^;;r rrrTG3 GGNCNCC - r:: tatnggsntn gggcckcync 

03 NGNKGGGGGT ACCCOCCTMG GGGGGTTTTT NSSGCCCCCC 

3 G AAKAA TWT MWWTOCNSGG GG GAAWTTTT NTSTOGAMC3 

^.,_ w TOGGOOIJCSA NNAWANGG-3G GG GGG AN AYT NT3N3 GNGGG 

. . i'CYGGTM T KA CXSGGGG GTTTKKAiCNG G G GG GAG AAA AN AAAAAAAA 

GrCICACNCT GI^VrNWNWATTR NAGAGKTC CT CX2XCCNC3C SN — ™ 

:;ngnnnaaa acnkgrmmac :-:csytycc:g ggyctcctcc *:chgggg^~ 

thtmoncotn gcctgcnccc ^cknkntgtc tmtcnmyggg 



3ktt* 



aatctngtca 

GTGTCCTNKA 
TTGGGACCTC 
m.G*j\_jG ^gggg 
TCRAANTTC3 
WTCCGCGGTT 
TCNCCGCACC 
•-u-iC GGG T G ', G 
YCNCACCACT 
- 1 jGGvjGAGvjG 
CTTTGNATTT 

ngggggct:;t 

AGGAANSfCTG 
UCSTKTGKZ^ 
AWA YGNK5TG 
S GGAC Y G G Z R 
KWNTTTATTT 

rakggykntt 
mgnsg3yggg 
ngscgnsty:; 



60 
120 
13C 
24 C 
30C 
163 
42 0 

3 0 
= 4C 
600 
6 6 C 

780 

34: 

o b : 
: 4 0 



- — v — — * < w ^, _ nARAC . E 3 7 3 . T _ 3 
A LENGTH. ::?9 Odi3- — i ■ - 



. — ■ ^ _ ^ w \ 



i-^sjGG CAAAC3"' 



ANTC3AACTC AGCC3GON> 



AT OAK C 
3GCMC3 
' J AAC C^J 

AAAMGA 
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C CNWATCTGG NGGTCCCNAN KYYGCCGTTC 
AACCGKNNKG KCCCCMKCTK MANAAAKATT 

cncyncwytc tmycsskwgc gcsmynanca 

MGCGCCKNTN TYCKSGAKAT ACASMNKTCC 

ccsnstmtyn ctsnnmkmnn tccwmwnatc 



7CXCKCACRG MTMTCWCC3 

(2) INFORMATION FOR SEQ ID NO: 321:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 296 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



NMAATSAMNA 


NMNRGGGTYT 


TSCYACCMMN 


-T -> r\ 
, u U 


RATCAMKWNG 


3GNKCKCNCN 


NAAMACCSCN 


^80 


SNGGGGAGGW 


GGSGRMKMCT 


CTMTCTCNCT 


340 


GCGCNGCGCN 


MAAMANRAKA 


CTAKCCGYGN 


9C0 


NTYYGKKCNN 


KCTMKATNWC 


CSCTSKCNCX 


960 


snmskntckc 


KSCNCCNCWN 


CNKCNMKCWN 


1020 


WACNCACACK 


NGWCTYTTCC 


WKNNMKCNKM 


1080 








1099 



Genomic DNA

(xi) SEQUENCE DESCRIPTION:



" OS AG GAT C W AN.GC::i:;0CG >!AAKCTWSTK CAGAGATCT, 



(2) INFORMATION FOR SEQ ID NO

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10"^ base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



AAAiT^GCA MGAGCGGCAC AXAKYSTCG7 CCMRACCCGG ~AYACTCCWC -0 

^ c:<A7ASMC **»™cr ^tacaccac cacsc™ :; 

^^^^L^^a TRAM C AAA C C ^CC~CGC™~ — _ 

CCCCACCGGC ACCAC^GCCG ^TT^^ j^-^ul^uGGu --^CACCAG 2 4 C 



:na 







AGGANGGGNG ZTZ'ZZZZTHC* WGGCTCGC3N SNSMAMAAAfl NKGKGGKGG3 GKGARRNMN; 
MCTGSNGNGG WTGCGKNKTG NSCNSGNCGS i'GGNSASWCG YNYGNGGAGA ANC 

(2) INFORMATION FOR SEQ ID NO: 323:

(A) LENGTH: 1166 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 323:





TTMMMTTCAY 


TCATTCACGG 


GGMTCTAGTG 


G GG C C G C AAK 


3T7GTCXACA 


63 


.3 AT CT G G AA Y 


T G G G CAM GAS 


.-A ^ — nn * w . ^OU 


G T KG GG CAAT 


a i CGJvjo * jvjG 






, ^ „ ^ „ p^^^ 


- v_ O O O vj - T.-U-i 


wJ^wJUU. ^ - 


RAT jGGTSTG 


jGTAjaTATG o 




* rj ~\ 


^. j>j CaG G 


~ AGAATTTCG 


_rTTT G G^Aj-v 


* ATGGGTGT 


JUUV- rtJ-i x ATI J 


_j GGTYCGGTA 


21 Z 


ACA G G G 3 GAG 


TGG RAATTY G 


GGTATTS GG T 


NAG Z GG TRAY 


AAYGTGAGGG 


-tSTNCGGTG g 


3 0C 


TTY GAATAGC 


^ GTAA _ GGu A 


ATGTSGGTTS 


YYYACYCCGS 


G G AAG GG W 


* T f ' ' W V V * f 


3 b 1 


T V. M G N C T S S N! 


C GKS AAM7SM 


KMGCTSTYCT 


MTYCNNGGAS 


TAMTYNMCCC 


G IGWAYCKSC 


4 2 , 


WAY 3CCTCGT 


CATYCOMGMG 


S G S G Y C C C G A. 


MNCIACCYTG 


NGYYCCCTCC 


MXMTCYCAYT 


4 8C 


IMNTCCGGTW 


CCTUTMMNCC 


C3CNCRYCTC 


AMCNCTKSCK 


CACCNATMYC 


GGACXCHTGT 


54 0 


MCYMCSCAKN 


MTTC3CCTCN 




MCMI3CTCTM 


TGMAACTGKG 


3CG0YCKCNC 


60 : 


*^ v"7"TCXCC 


A YNMAA G G :G\ 


T/r^CNWYG 


YMYCKCKCAG 


WYKNMCTGCW 


A G T GTMYNT T 


6 6 C 


TCTCTCNKCC 


CMKA3 3XNTT 


CTCWCSCCCC 


CCACAKAYMC 


YAW C Yi TV* TGG 


MCTCKACSCC 




IYYCNNYCCM 


NMCW3MTCWC 


twnaxcancn 




MM YM T MA G X G 


WCNNTCNCCX 


"90 




AC T KM KG C KM 


T ZTG ITTMCX 


GG YMWCNTCC 


MKYNCCCTCC 


NMTCMTCKYT 


84 3 


ictoncnmry 






yy> CAK C T K CT 


3CCCCAKMK3 


ACNCKCCCWC 


9CG 








::r/iMYMKMC 




cnactmnmwij 


^ 6 J 








ITYCKCNYMC 


NRWCTYRCCT 








jtcc icwmxm 




,'TMM X3TCTC 




'NKCCYNYN Y 














■'•.'MTTACWN G J 
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^CCC^ZZZZGZ GGCZZZGZCX AAAAACCACC AATYCCGYTG GGGG7GKYCC CMCAGGCSGT 480 

TGCTYCGNGY CACCTGGCCA AAYYCCCAWT AXATTGGGTG SCYCKTSCGG TTSYTGGGCY 540

GAATTACCCC CNCGGGNAAA GRRAAAANAA ATCNTCCNTT ~GCTCGGYCA YC3TTTM7TGG 600

SAAAAGGGGC ATGGCSCGGT TYYTTTACCT CAAYCCCCNA NCANTWACCT YTCCSCCCGG 660

GGGG U CAN AA CGSTTNGCTC CGSGGNAXCG TKGTMCCCGN A T CN AAA GGC ■CNGAATTTGG 720

7YYSS7YCNA A TTWTW KKKY CCCCWCNTTG YAAAAAKCCA AAASAKCCCX YCNCAMMYKT 780

MGGGGTY3SG GCCKNYCTTK SNMTTAAAC Z CYCCCCAAAA YYNSGGGKK7 TCCGCYNSAT 840

KCCACCHCCK GNGGGGGGNA SAAAAAAAAY TTTYCCSAAA ATCCCACCYY TCYKTKSTRY 900

'AMACCCCCTT TYYMKKAYTC CKYGCNATTC SGMTTCWAAA TYCCGYCGCT TOTTCCCGCK 960

CSGGUGCCCC AAWTTTGKTT YNCNANTTYC CCCNAAMNCM AWTMGGGGKS KCCATTCTGG 1020

SCYTMAANTA AAANAANGGG NKTTTYYCTY MANAAACACN GTGKCNCNCN OJAAMAAASN 1080

AKMAAAKAGN KKKMTKNNSA AANCCNCCCC CTSTYTNYTT NTCTNMNCKCC CYGGKKNKGM 1140

3 W S WYNTT CT XCZZRCCCZZ YNYNKTGANA AANMNCYCCS GGSTMCRNAN A3NMNTTTCK 1200

•3™STXGy.GCC KMBASNANAN MCAMWKWYCC 1230

(2) INFORMATION FOR SEQ ID NO: 325:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: J 2 2 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:325:

^GN'GGG fcj?;a TWAYCWTC7C acssggtcta tgcggcgcaw CTMGTMAASA gatctcmaay Q c 

"Z^ 7 ""^™^^ 3CATTvrTC>!MC CATATATAAC CATTGCGTCS GYWTSCAWCT CRAAWCTGTC 12 0 

"^..23 KG 233 TTKTACRAAG GTGGMWTGYT CWTYCC7RAA SCCCTCRATI TCKTKTATYC 18 G 

::t:-:gggctyc actt^aacsg rat?:sctgcc ttktaycatt ratgsaawta wtggycrawt 24c 

■™3CAGGCC ?A33GC r .*YC~ I™YCCCC?A 3P.ACAATNGA TTGGAWTCGG TYCGCRAGGC 50 C 
733 3CACCAR AC33GGCNC3 AAAGGYCCGC 3CAAWTSCCT JGKT 2AAAAA TGG 

vaamc::atcc ocggyttrac cggagytamc acaakaaaa 1 

. tc2ratcv.y 2wyccccacg ct3aa0~tgk ytgc3gtatt 3cctkcctgc ctcracaggm 48c 

. o:;ccc:™CA .aasctgsggt gactgcaact ggtctggycg aasgggggyt /wmcggacaa 54c 

-^accccran:; tcgccaaatt ttc::cocccc ^/ogggaaan ircrGATr-rrr" ocsi-jaaogsa soo 

3XGGGNNY7VJ NAACCCTGAA C3SSG3NK3A YYN T SCC3GGA AITTTTTCTCT ~3N r GGGG3R! 
AAANCCTTT7 AAGGTACCCC KGGNGGGGKG 



zgzz g ca c "A wnt IT 4 2; 



^-r\ TT C* I G G G G GG 

i _ *j _ o i^j jAn / V C GG 3 *** 

TNNGNRT" TTrRGGGGN'MT 



D b u 

75 G AAAACAACCC 2 KA'ITGGjCTT "20 
V A ■ ; " - — \* ^ - ~ } ; >" Y 3 "7* 3 3 N A F ' i 4 



information for sec ::: :is 



1-33 c a 3 o ^kt. 
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(y:J SEC/JTNCE DESCRIPTION : SEQ. IT ND:32f • 

rrNCGNNKNTA TAMAYCWYCT NCACCSGGGA TCWATTGCG« GCGCAATCT7 S7MAASAGAT 60

CTCKAAYTCG 3CAMGANCCG CAVCTA7T7G KGTGRASCGC ACCAGCGRGA CCTCGCSGK7 120

CKTTYCTTGC AGRGAGGCCK TGGGTGGCRC CGGTGGCAAT GCCAACCGCC CCCCAAAACN 180

CCGCAAATKY CRAAAAACAA CCCSGGG3TA GKTCCSGGCC GCCAAATMAA TAACCGTKTT 240

AACKCAGGCN ACGGCCAACC GGYCCCGCCC AAC C AAG ETNA CCTCCCCSCC KATAGGYCCG 300

GTGGGGGCTG CCKTATYKCC AASTCGTCAY CTCNACGGGM CGGYCCMCWT TCCGCCTCAT 360

CCGTCTCTCC TTMYATTTTC CRTCCACYXG GCGGGGAACY TTTTTNYCNC CCTTGSCMAN 420

CACCNAAGGY CNAAAATTNC CCMTCCCKYG SNNCAAAYGR GATTGGGGTY CGKXTTTTNT 480

TCNMCCMAAC CCCCNTTTNA CGCCCCKATC CCYTWATACC CCCWWMCTCNS ANGKTTGNSA 540

AAKTNNCCCC AAATRCCAAA MTTCTTCGCC NTTTMTWKCY YYCCTTTCCC CMCCCVTNTU^A 600

GGSCCRCCYY TCGGSAANTY TCCCCNCAAA AWTCAMWCOI TTTCCCNCCA AGAAWTTCSG 660

SACTCCTTTN TTCNGGGNAM ATAKATYYTT YCXTNGGGSK TTCCGMTCNC AKMAATNTCC 720

RGGGKAAMCC AGKNTXCTTCC YYYYCCCCAA NN7YCCYKGG RMCYNNYYCY TTAAANRAGR 780

S AACCCKSGG GKCYNCNCSS TARCCCCCAK KAAAATTTCC CCCSSKTTTC TYYNNKKXRW 840

GCCCCCSAAM ACTMTWAYTT TCCCKCGNNN TTTSYCCKC5 KCAMWMWMTG XXNCTTTTTT 900

YCSCMATAKA CTTN3GKCCT NTCNYGSSCG CMAAANAAGG CGCGSTTCTN T TC VMAMACA 960

YNTSGNMMMA SAAKAKWATA A WNNT, RXXYX TKNNCCCNCC CKCKCTTSNN TNKCCMCSKS 1020

G G KNVN KKH C'.^CTCCVCNC CKCCCNCKk* CCKWATMCCC CCCCSKCCGM NCMMN7T7K7 1080
CCC 1083

(2) INFORMATION FOR SEQ ID NO: 327:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1069 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 327:

GGGGNNKYAT KCAYCWTCTS YACSGGGKNC TATTGCGGC : G C A WYTN G TK CAS AGATOTC C O 

GAAYTCGGCA KGAAAAAAGW GATGTGCTGG ACCTTMCCG: GCGGGACGCR AC7RACAAAG I2C 
RAAS CGCGCC ANAATATTGG CCACAKTTGG TCACATATTT ACCCAATTTCT AYCAGGGAYT 
MCCATTCCKG GGACCRACC3 CACAATCCCR ATSXTG-3TTT GCRAA CCCTR AGCGI 



4 ;= o 



: 9 o 

MYTYCGCCRA STTGAACCA^ G^CP^ViW. CGGCCRAAWY CTCGCCCTGA NTCCCGCTCS .13a 
GCGCNAATAA CTAGGCCCAT TKAACGGAAC CGGNGGCCGC NANTTGGCCA ACAGGTCGTR 36 0 
ACAAAGGGGC CCCASYYCGG CCGGVTCCCVf TTYCACNUCC TNr' T C T CKTG CCGAATYCGG 42 0 
WTCCRATNYC CCWTGGGCCT TYTC'/YCK.Yr KYCGGTNZCA AWTCTNGGTA TNCTATRGKG 
TCCCCTAAAT SCAN AT 2 TOG GCKYCCATTT NCTGGSNTTC NATTTAMMAN S P-RCGGTTCT 
TTCW77CCRA AACCG5N7GG GICCNNMCCA AAA AAT 1 >AT.\ ATAATAAT G r'. YGGCTTTCAA 

accccgcccc cccatt:rwt csgttccanc ccc2ngn-ggt 7A,v,r.-^ .a at ny-;;;A>:: 

i CT».'u'i»o i « — N^\TTTCGG>JA A\AA CCV C V I G 3 G Y C 7 : !l"vA-^ CTC.'YTTTTTT G S K ^ NT 7 '"-^ 

cctckTt:sc ov^ac:-ov\ a t n ri'iiri g g ggyccktt;^ ao^ggycrc rc:ggaaatt 
i TTY7GG7* z a.\cccc.^\cc TT7TCAAGCC NnTTYrrr: , :'kcc3>:cs>c; tn-'-;ssgggnt 

:-^SC'JNTTCY K/1KKKCCN.V-N G-^GGGVYCYN CCCCRKNTT7 CTTTTTTTTT CCGTNNVAV^ 
NGKTTCTTCA AASMCCTCCC SCCCCLN5Av\ ACCCCCTirAR. G TT T TY CMMA AA-N'NTC^GN 

Ky:cccccccc m*;aaaaaay ycoCccgnw acsmsn:>^a m'jcccj^j^; ^trktt-.tt 

•V:C^S,GYCCC TK^V'JvreK GAMN5^TTTT TJ^.-KG^*Ky 



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<i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1210 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 328:

NGNGGGGKWK MATACATCWT TCTTCACGSG GGATCWATTG CGGGCCGCAW TCTNGTMCAA 6C 

3AGATCTCGA TYTCGGG CAM NACCCACCWC TCCRAAAAAA ACCCRAAWCT CGGGSKCTYC 111 

GARAAGTGTT GCCCGCKTTR AA TTT AA CAA ATTCAGTGTC ANAGTGTCA2 GGCXTTACWT I B 0 

YCCCGGCAAA GGGGCCA2AA CCTGCAGRGA 3CACYCRATG GKTGYTGKT3 UNCGGGCGGG 24 C 

2CGGKTNAAG GGACCTGCCT GGGTKTGCSC TMCAAANATC WYCCGCGGGT V CGCTGGRAT 3 00 

MCNCAGGGGT GTCAAAAAAl. 2GCAAACAGG 2 A CSC CAN C C NTTTA2GGG3 2TTAAAANGA 361 

.-•A-f-vAvj _jvjs_ G . \. i o w ■ v jo ^j^j o ^ ^_ . i . . v j r* _ A_AO C 1*1 ^Cl -2w*»^.-i * ll^o . -t^„ 

itctc"tgcc raatccgrwt ccsatnycnc cwtggccttx tckyctyit: iggtacccaa 46: 

.-iTCTG jGTAT v.C±ATnS i j. -CCCTAAWTT 2CAAATCTGG GCTGT 2CAI I TSCTTGGCNT 54 2 

TCC2-JVATTTA CCAIJCAAJGJ TTTCTTNCAT N'CCAAAAACC GNTKGGCKCC NRACCCRAAA 600 

AA-ATG^TAA TAATAAiJNGG K2:^;TTY2!;a AC3N2CCCCC CCCNATT2CA TYSNGTTCCA 66 2 

NMNCC2CCAG NGGKTAGGTX GGGAAANYYC TCMACCYYCA ANCC2TWAR2 TTTTNGRAAT "2 0 

KAAAC 2CTYC YCNGGGTTWW TYMAAAAAMA :rTTATTTGGN NGNTTTCGGG MWNCXRKNST ""3 0 

GCCAAAATCC MAAATANTTT YYT G G T Y '22 J A T W AAAAAM C G YGNCCMNCC 2 G G AAAA W T T B4 1 

ttntgkttsa accccaaaai yttttcmnaa ncss ktttty cyttc 22222 amnwtgggys 9co 

jggnaikgyg scytntctta tktxytymt* 2mggggggnn mkmtcmm2cc ccmtttyycy 36 0 

NYWRTTTTTM kccccxtnmp :;nraa?jnggn YTC3YNANAA aagcnccccc sc ckncccna 102c 

AAAAW2CCCN "*JNT I A R A KTIJ""' TTMKANNRMN ^CKCNKNGXY YC2C2C22W2 YNMNNAAAAA 10 81 

AATMY2CNCC ?ASA;;yCASY N'MGGRGMrS" '"C^CCCCSTT WNNTMTTNT TTTTTTCSRA 114 0 

:-AGc:-".2cscG MM?;A::yr2:c v — — v- :: :;:;g;:ng:;gnn ogngmncx22 iciiagaamw?' 12:0 



. rt ' ^Liij w". . ^ j r OdSC ' J d 1 1 . 



-•sja _ — . r.^w v. ■ ^ \ „ 2 2Aj^2''' 2AA A22-C- . *AK2 2AAGRAWY . 1 2 ?. .. 221 222 2 2 4.. 
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ATTTCSGRAA SAACCCCTTIY 3CCGGGTTT7 YCCT ~— CMG GC2CAANACC 3CCGGGAATC 66G 
AAAAASGGTC GGNCAAANGG GCMAAACCCS 3ACCCMACTT WTTCCRCTTN G GGGGG S CWN 7 
CCKNGTTTAA AWKSCCTCYY CT3GCCAAAV TCGGKCMAAA NNGRKTTGGK TTNGGCNACC 78C 
NTTTGCGGKC CCGGGKGKGK WGKYCTMNMA CS TTTNTTTT SCCCCYKAAA :JYSCC7CCCC 84 0 

r3G3SGGGC3 CGCGGGGGGA NNTTTTTAMA GKXTY CCGGT CCCCAMAAAA ANACC7CNYC y00 
CCSGGSCCGT TTKRWAAAMN KCTSCCCGNG GNNGGGGKCM GG KTTATTMT NNNCCSCCCC 96 0 

TCCGCGSAAA AAATAKMTTT SYCCCCCCNC CTCCKNCKNR GKAMSMSCGC TCCCYCTCNC 102 C 

GCNKNTWAAM ARSNCCKJOT; CCNCVKCCGS NSKGKCNWCD NCCSTSSNCT NKGCNCKNCN 10 8C 

KAAANAAYNC NGSMSTSSMZ^ CNKCC" : ^ 05 

(2) INFORMATION FOR SEQ ID NO: 330:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 336 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:330:

:;gs nsnknnn tamaycwyy- tscacsngga acwantscgg ccrkawct\ t s tmkasagatc so 

TMGAAYTCGG CAAGAGCGGC AAGAGTGTGT GCATCTGGTC ANAGTSTMMA CRCGGT3CC0 12 0 

C3GGT3KGTR GASCACMCAT MTGCGRACAC CAAAC CC!\TC 3C0GGYCAC2 CGCKTCGCCT I3C 

3CAAAWYCCT CCAGGC3AC3 T C RAA C AA Y~.v 7CTYCTGCAA CG CARGCCGT TYCGC3GCCG 24 0 

RAT 7CTGGKT CASYYCGCCK 7GCGGTGCCC AAGKTACTGG CSCAYCAAAA CCGCT22GGG 

^ = P ^ CKT AAWT ^ TGC::3 AA7TTCNTTC C77TGCGCCT T G AT AAATTT JJTNAACCCAC 

C3CAA.VCGTY CGGGCKTCTC 2TC-TGCCRA ATYCGRWTCC RATAYCGCOA TGGCCTNKTC 4 2 C 

:-7YCTYC:\YCS STAC Z CAAAT 3TTGGGTAT7 ITAIA^TXYC CCWAAANRCA AWTCTGGGTK 4 80 

^3~;^;^^ C "3G3KTCC7A AT T T AMM A C A L'GG 3TTTCTT TCWTACCAAA AACCSNTGGG -40 

— -LnCCRA AAAAKGATA.A. 7AA7AAKGTG OWWWCAAAA'J "O'^L'T" 

3TCCARCACC CCANGJJGGTO; ^C-GTNGGAAT 77CXAAC777 OA JCGCA 

7MTTTCY3GT C7WAAAAAC3 3GCCC:JC777 '.'kAAT^^TTT JK.CAACCCCA ^ACCTTTyAW 

ocNMinTCYY ycccncacaa ttiggsggnjg; v ;gs^^::ttyt twtttyyn::a jjggggrrwc 
g:;cccgj;aan yycctjaankg :;kcccggnma ^a,\gaga:;tt ycmkaaaaaj ■jccc::t:jc::c 
:;aaayacccc maaakwttc" aaasm scnng y — 



300 
3 6 0 



o w ^ 
o 6 0 



CO 
3 6 



rNFCr.MAT: 2V. 



strand e:;:;e37 
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Ml) 



— sggc Rawtyc~pac ecggaatwtg 



'TAATGGGR AA: ;~:'G GG GW 

^ — -nr^A~~ JCGGRWTCYG ^GTGGGGGEA A7TGGG3:GA7 



:;aggasyy jrgtaacggy gmcgaggg. ^ J ^ 



ooa 

660 
720 
780 



1041 



^""^G KTSRACTTTA AASGGTAATC ,00 

CGYCR^ ^£~CK Y^SS^ 5T S ™ C = WGGGTCTCOC CCTOGGYOIC 360 

— GGGCTG f ~j^£~* ™ TKGGTYCAAC CAACCCACTT CACMAAATTG 42 0 

^SS^° K TAATAA ^ SG ™KTCSGCC MYCACCGGWA 48 o 
- AlANC - C Gv-o^S^ owUATTTCC SAAATCATYT CCTTCTGRAC CCCCAMMRr 

CTNSAAATCC GRATCAATNC CCCNKGGCTT NTCYCTCTCN GTRCCCA^ ^ 
RKTNCCCYAA TSCAATTGGS TTYCCRTTSC YGS TTCCAAN TTNACAAMAS ^T^rt 
ACCAAAACCC NTGGS.CCNNA CMNAAAAKNA RAAAANAKGG KCTtSa^ £££££ 

SJSSS ™ M c Sr ™ KGN GAAAYTTHRA CCCAAN ^ SSSSi; 

™aSa« c c sssi r??o T c ssss 7 yccaaa ™ sa AAATmcnt 840 

A^A JTTTTGT NAAMCCCKMA YYTRTTWMCC WTTT^^^Vr- ann 

GCMCNNSNSG GNTNCGGTTY TYAT^C^m MrDWwcr^r WTTTT^GYC^ 900 

anu.L.^ MCRNNSGACN CCCCMNT^^" TVmrmjr^j a^n 

sssse ~s ™ cctk ~- 

(2) INFORMATION FOR SEQ ID NO: 332:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1073 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 332:

♦NSG3GMKKX A TAMA 7 C W C" G~SYA~~ V" jn — ^ 

—C— AA<- — ^^—^ ^M^rfATTGC GGCCGMAWTG 7NG7MAASAG 

-^ACM-GaI ^r— ™~V ^ CMAY " TC AAG 7 G 7RA Y Y GGG7CAGATA TCM7CGCGNG 

:gcgggggac -g^gaaaagg : — —Z'ZZZ .""^ cggga7gggt ratgcaacyt 



I 3 0 
34 0 

i 00 



^ - 7ACGKT7GGG " r " T " rr ^ A A r"^ n " -n >. t-»t^ - • - 

^i;^™: M :;^;:^ ;;^:' r? : -'^-^ ;^nnccacg:i .AACccGGrrrr 

"Ci.^;^:; :::, jGG ' r ^''^ -cggttgtgr .aay cttnatg cm^caaaag 

• _^ -^V7TSGG:, t t-^ ";ggggGG7GG — GGNAATn^ -—t-^^v^ 



4dC 

5 4 -J 

6 0 0 






(2) INFORMATION FOR SEQ ID NO: 334:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 996 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



120 

xao 

240 
300 
360 
420 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:333:

3NSNGNKNTN TMCAYCWvr- Q^:rQr-^ 

* ^ * ^^CSG^TC iATTGCGGCC GCAA'™' rT T\rr^ 
CGATYTCGGC AMNANAARTG ~CG~C~- CA> -^.^J^ G ' -^^ATCT 6 0 

sssss j~ ---c s s 

™„ ™S ESSE 

illlpg = 

3GTTTTGGGC AACCCCN-V3 ^^f^ ^TAATCAC C GGG CMCNC CT 50C 

^-^^ -^^AAA CATTCCGSCC CAAATGGGNC 3TTGGSAAAT i6 - 

-SGKCAAAWS ' ■IAASAN CTTAMYCCAN 7TCGSSNTCC ~. C 

:^r^"f Gow\V AAGGGCCCCC CGGNTSCKCC GGGGKJCGCCC -GGK— CAA 

=^ ST «sc 3 3 :r.Tcsccccc csgccaagra cgguggt-^ ^33^ : 

„„„. _„ -.w^o KAYTTNKSCr ""JNAAAr"'-- ~ ~ - 

.^"T: .GGKTTCNNC CICCSGKKCT CCMTST— m r*RC— r~ A - 

" KCjMNN « -aaktmywkc cngcccnnak sscrr::— to 

jjj.«.jjvj U i( u .V-.. rt _-Yb:JCXT CTKnrrSMrw vv, ^„„„ .„ 3N 



j::.\'3::ijnkw>j atmcavcwy? ■■- 3CA c" -r-.-n 

VCC-GAAYT CGGCACANAG " GGCAC.V* ^TMAASAG 

iGTGCCGCSC CTGGTRASCA rM~AT""3C ^ "-A^C.*: li^T^*"* — ^SC — 

rCGCCTGCAA AAYCCTrCAG ~C"ACGV^ — ?CCSCGG GYCACCGGCK ; 9 C 

j C GGCCG RAT 3GTGG KY n A =; 



.AA.CAAr.VYCT CCTGCAACSC A2SCCGT7YC 



J-TYCSGGRA ACC-AACC~A AA^ — ~~ l"™™" - WYCSANft CC 20C 



GGKCG AMMCGC3RGN G - v-- 



*~~*amancag :;gc 

1 - .■ ^ juT3 C TNI 



- v ^ _j ._j"iA 

~:::g:.av^\c 



iV '' G ;gv - : --ataa .^mmmsggvg gamcggg 



GSNTTCGGm; VCATGGCTNN 



iAAA WTTTTYTTGN 3 C 

rrGG GNSGGARTKT 94C 
GGTNNNTN* K>JK7JCG3L\ST 9C; 



■^V.irCTJ;CC R G R CM ANT G G TNSGGAKTKT 
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(A, LENGTH: 1^4 base pairc 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 335:






NGNGGGNKRN ATMMAYCWCT SATYYACCSN GGMNMWATTG CGGCCRMAWT CTNGTMKASA 60

GATCTMGAAA YTCGGCAAAG AGYATKCTCG GGGGCCAGAT TTNTGGCCCG 120

ACTTTGCAYW TCAACAKTCC 3GGTGCCCCA AAAAAWTCWT ACCCCCATMC TYCKTGCASM 180

ASYTGCGCCC RATTRAACAC ^CGGCCGGC^ TGCTGCGCCA GGTATTYCAS CAGYTCAAAY 240

YCTTTKTAGK TAAAATCCAG C3GGCGGCCA CMCAGCCGGG CGGTKTAGGT GCCTYCRTCA 300

ATMACCAGCY CGCCCAGGGY CACCTTGCCC AAAAYCTCCT GGGTCAGCCA AATTYCCGCS 360

CCGGCCAACM ACCANCCGCA 7YCTGGCNTC AATCYCACCG _rG CCCGGTG1 TAAAMMANMA 420

z-ratctcktc MANCCCCCAN TCAGCSYTNA CNGCMACAGC CAMACCGCCA 480

■ — r\r\\- v. uu ^ . i GTCAAACi'CA ACAGGCGGNC AGGCCTCCCC CGGANSAAAG 540

3TCTTACSCC MIT/ AAN AAAA MAAGNTCTGT CASAASNAAA AANCCCCSGC 600

C3GGCCTTCN MMMGGGT7TG GGGMANANAA AA?. GGAACSNATC CGAAAMCTCC 660

CAAGTCNCMT TWAWAACYCM NNAACCCCCC ANT _ TTGGGA AAGGNTCCCG MTTMYCCCCC 720

TTTTA3GKTS GGGMMYYCTY TAAAAAAATT CCv_ ^_ vj'oGArto GGTCMAMCTG 780

jGNAAATTTC CAAM C CM"i\ G E T7NTTYNGG7 TMCGGGGGRA AATTYCNCTC CCYYNNNGGG 840

CSSGSNNNAT TAYGGMSNMT TTTMNAAWTM NSGKKTSAMM YNNKCCMNNN SNNMSMANNK 900

TNAMCKCCCM CCTCNGNGXY igcynqeccg GM/~lGMGGRAS MKCCNANMAA AYASGNTTNK 960

CGGAAMMCNN AA7KGMNNSC 3CGGASMCMN M^ttlAAATMT cncmkctisnn AANRGMRACM 1020

CCUNSNSGMN rrgaaplmtny YCCCCCGSKM GKGNKAAAAW GKYCCCCCCM AAAG 1074






(2) INFORMATION FOR SEQ ID NO:336:

(A) LENGTH: base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA







TATKSAGMGG TXCCGMAGMK CCSC3TTTKT TKT 3 A! I AAMK MSMRKNKJ3TG CGMGYTCTSC 96C 

GGGNTTTGTA 3AGTAKTCGS CSCS 3MWGAC WCSGMCMGNG AGKN>3TNNTS YANTGARCGY 102 0 

y^JSKTMKMT MSCSCGCGNA GGAGNGCCCC CSANGMSTGY NKGGNMSSNG ARAXGATGGS 108C 

3GCCNCGMNN MGMGGANMGA 5ANNGMGGMR GGGGGKTGKC TCKCSCCGNS C S AN G RA GAA 114 0 

3KTCNGSCGC C G MGG KY G K T KT KT KNKTGG YSTCMSSMMM NAGAAAAGAG AGGGC 1195 

(2) INFORMATION FOR SEQ ID NO:337:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3572 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 337:



jTTG G 3. 3 AG CAT C G CJ- 



TGGTTTGTTG AAAACCGGA3 ATGGCACTCG AGTC3CCTTC GCGTTC2GCT ATC3GCT3AA : 3 ; 

.TTGATT3CG AGTGAGATAT TTATGCCAGC GAG 3 3 AG AC G CAGACGCG33 3AGA3AGAAC 13: 

TTAATG3GC3 C33TAA3A33 G33ATTTG3T GGT3A333AA TGCGACCAGA TGGT33ACGC 2 4 ( 

33AGTCGCGT ACCGT3TTCA TGGGAGAAAA TAATACT3TT GATGGGTGTC TGGTCAGAGA 

3ATCAAGAAA TAACGCC3GA ACATTAGTGC AGGCAGCTTC 3ACAGCAAT3 GCATC3TGGT 

3ATCCAGCGG ATAGTTAATG AT 3AG3CCA3 TGACG3GTTG 33CGAGAAGA TTGT3CACCG 

CCGCT.TACA G3CTTCGAGG 333CTTCGTT 3TA3CATC3A 3AC3ACCAC3 3TGGCACCCA 

W °^^ JJ ^" ~ . a^CGCCGCGA 3AATT3 2GA 3GGCGCGTG3 AGGGCG-AGAC 54 0 

GAGGTGGC AACG CC AATC AGCAA3GA3T CrV-3ZZZGZ CAGTTGTTGT 33 3AC3CGGT 6GC 

ATTCAGCTCC G33AT3GC3G 3TT33A3TTT TTCCCGCGTT TT 3 GC AG AAA 66 C 



:ggg 

3 t 'J- 



w . - -A — A3 33 3 3G AAA 



nurtun^AC 3 3 GCAT£ 



30C 
3nl 



4 8C 



^ - - -AC AT 33A33A3 337 3AATT3A3T" ^Z'ZZ'ZZZGGZ " 8 
--aAGGT'CT/T 33 3C3ATT33AT 3GTGT33333 AT3T3 3ACG 2 3 4 i 



- . ^..-v r^jf\ oC_CGAAGT3 3C3AG3CCGA I02C 

33A3 3AA 333CAC3T3T 3G33 33GGT3 10 8 0 

- ' ' ' * - — - 3 A3 AT 3 T 33 A3 333333 AAATT AATA 3 114 0 

aTATGGGC3A 1333 CAT 3 AT CATCAC3TGA 333 A 3 AT CAT "m? 



- ~ J v_ - ;uU/f. 



TGAACAT 



1 3 C ^ 

; d 6 
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211 






GGG^. GGC'GGCAT CGAAAACCCC GGCGAACGAG GCGATTTCGA TGATCGACGG 
GCCCGCCCCG GACGGGTACG CGATCATCAA CTACGAGTAC GCCATCGTCA ACAACCGGCA 234C 
AAA GGACGCC GCGACGGCGC AGACCTTGCA GGCATTTCTG CACTGGGCGA TCACCGACGG 2 4 00 
GAACAAGGCC TCG^CCTCG AC CAGGTTCA TTTCCAGCCG CTGCCGCCCG CGGTGGTGAA 
G..GTCTGAC GCG..GATCG CGACGATTTC CAGCGCTGAG ATGAAGACCG ATGCCGCTAC 
CCTCGCGCAG GAGGCAGGTA ATTTCGAGCG GATCTCCGGC GACCTGAAAA CCCAGATCGA 
CGAGGTGGAG TCGACGGCAG GTTCGTTGCA GGGCCAGTGG CGCGGCGCGG CGGGGACGGC 
CGCCCAGGCC GCGGTGGTGC GCTTCCAAGA AGCAGCCAAT AAGCAGAAGC AGGAACTCGA 
CGAGATCTCG ACGAATATTC GTCAGGCCGG CGTCCAATAC TCGAGGGCCG ACGAGGAGCA 
GCAGCAGGCG CTGTCCTCGC AAATGGGCTT TGGATTCAGC TTCGCGCTGC CTGCTGGCTG 
GGTGGAGTCT GACGCCGCCZ ACTTCGACTA CGGTTCAGCA CTCCTCAGCA AAACCACCGG 
3GACCCGCCA TTTCCCGGAC AGCCGCCGCC GGTGGCCAAT GACACCCGTA TCGTG CTCGG 
GCGGCTAGAC CAAAAGCTTT ACGCCAGCGC CGAAGCCACC GACTCCAAGG CGGCGGCCCG 
G^GGGCTCG GACATGGG TG AGTTCTATAT GCCCTACCCG GGCACCGGGA TCAACCAGGA 
C? C ^ 7 ^ CG : ' /gacgc::a --CGGGGTGTG TGGAAGCGCG TCG7ATTAC3 AAGTCAAGTT 
GAGCGATCCG AGTAAGCCGA A CGGCC AG AT GTGGACGGGC GTAATCGGCT CGGCCGCGGC 
GAAC3CACCG GACGCZGGGG 2CCCTCAGCG ,'2TGGT5TGTG GTATGGCTCG GGACCGCCAA 



* G G ACAA G GG r 



280 



2460 
2 5 2 0 
2530 
2640 
2700 
276Q 
2820 
2880 
294C 
3000 
3060 
3120 
3180 



C 



. ^GCGGAA .^oAi'^'LGGC CTTTGGTC3C 23C0 

^^"^ — — GGGAAGTGGC TCCTACCCCG ACGACACCGA CACGGCAG^C- 3 3 60 

GCGTGAGAAT ^GTGCAGATA TCCATCACAC TGGCGGCCGG 7CGAGCAC 2A 34 2 0 

-_ACCAC CACT3AGATC CGGC~Gr~^ 2AAAGCGCGA AAGGAAGCTG ACTTGGCTGC 

^^IlT 7 TAGCATAA - ccttggggcg tctaaacggg TCTTGAGGGC 

. TTTGCTG AAAGGAGGAA CTATATGG3G AT 



34 8 0 
3 5 4 0 
35-72 



(2) INFORMATION FOR SEQ ID NO: 338:
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i2) INFORMATION FOR SEQ ID WO: 340: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 340:

Thr Thr Pro Ser Xaa Val Ala Phe Ala Arg
3 10 

(2) INFORMATION FOR SEQ ID NO: 341:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 341:
Ala 31 V ->' s -la Sly >; dd Asr Val Xaa Arq 

(2) INFORMATION FOR SEQ ID NO: 342:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 13 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single



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216 



(ii) MOLECULE TYPE: Other

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 343:
IAGTTAGTA CTCAGTCGCA GACCGTG

(2) INFORMATION FOR SEQ ID NO: 344:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:344:



iTQACG 



ACT C 



— I^*>!ATION FC?. se- - j. s 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 14 12 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



MOLECULE 




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— cSSc^ SSS 1380 
££££ 3 ^ C G S ?™ SSST 0 acgagSS ~" -0° 

CAAATGGGCT 7TGTGCCCAC AGCAGCAGGC GCTGTCCTCG 15fi0 

GCACCGGCGA acCTGT^ ^™ C ^CGCTGC AGCGCCACCC : 620 

CCGGGCGATC CCAACGCAGC ACC^CctcS GC^SSS CCAACACGCC ^TGCCCAG I S80 
ATTGCCCCAA ACGCACCCCA ACCESS ATr™ ACGCA ^GCC GCCACCTGTC 1740 

OCGCTGCCTG CTGGCTGGGT ££££££ SSS" »«0 

SE~ ~ £=« ~ - 

TGCCTCSCGA CGGCC^ -"a— I <™™ =220 

ATCCGGCCTT TGGTCGCr-" ,C-3^^ C ™rr:r CCAAGGC ^ -CCOAATCG ;; 80 ' 

TTACCGGCCTT GA — m .-ACCGACACC 3CAGCGGACC 



(2) INFORMATION FOR SEQ ID

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 902 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



24 0C 
24 1Z 



MCL EC T JLC TYPE; 
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^ 4. 3 

Asp 
Trp 

Pro 
Ala 



180 :a5 
Lys Leu Asn Gly Lys Val Leu Ala 

1^5 2 oo 
Thr Trp Asp Asp Pro Gin lie Ala 
210 215 
Pro Gly Thr Ala Val Val Pro Leu 
230 

Thr Phe Leu Phe Thr Gin Tyr Leu 
245 

Gly Lys Ser Pro Gly Phe Gly Thr 
260 265 
Gly Ala Leu Gly Glu Asn Gly Asn 
275 280 

Glu Thr Pro Gly Cys Val Ala Tyr 
290 295 
Ala Ser Gin Arg Gly Leu Gly Glu 



31 y Asn Phe Leu 



110 



335 



Phe Ala Ser Lys Thr Pro Ala Asn 

Pre Ala Pro Asp Gly Tyr Pre lie 

35S 360 
Asr. Asn Arg Gin Lys Asp Ala Ala 

Leu His Trp Ala :ie Thr Asn G ' • ' 

390 

Val His Phe CT. r. Pro Leu Pro Pro 

-eu A^a "hr Tie Ser Ser A^u 

-eu Axu 31r. V. : Ala Gly Asn Ghu 



Ala Met 

Ala Leu 

His Arg 
235 
Ser Lys 
250 

Thr Val 

Gly Gly 

He Gly 

Ala Gin 

3c: I \ e 
330 

Gin Ala 

lie Asr: 

Thr AI t i 

Asn Lys 
395 
Ala Val 
4 10 

31u Met Lys Tnr 
Hu Arc: He Ser 



Tyr Gin 
205 
-Asn Pro 
220 

Ser Asp 
Gin Asp 
Asp Phe 

Met Val 

285 
He Ser 
300 

Leu Gly 
■j In rt ;a 



365 
Gin Thr 
380 

Ala Ser 
Val Lvs 



19C 

Gly Thr lie 

Gly Val Asn 

Gly Ser Gly 
240 

Pro Ciu Gly 

255 
Pro Ala Val 

270 

Thr Gly Cys 

Phe Leu Asp 

Asn Ser Ser 
32C 

^^a A.a Ala 

335 

Met He Asp 

3 5 0 

Tyr Ala He 

Leu Gin Ala 

Phe Leu Asp 
400 

Leu Ser Asp 
4 1 5 

Asp Aid A I a 
43 0 

Gly As;: Leu 



Va i 
455 



" -* o e r n r .-i j. a H y S e : 
4 6 0 

ir Ala Ala Sin Ala Ala 
4 7 5 

i 9 0 



Val Val Arg 
48C 

Glu He Ser 
4 95 
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Gly Asp Pro Pro Phe Pro Glv G 1 ~ 
625 63 0 
Arg He Val Leu Gly Arg Leu Asp 
645 

Ala Thr Asp Ser Lys Ala Ala Ala 
660 

Phe Tyr Met Pro Tyr Pro Gly Thr 
675 680 
Leu Asp Ala Asn Gly Val Ser Gly 
690 695 

Phe Ser Asp Pro Ser Lys Pro Asn 
7 °5 710 
Gly Ser Pro Ala Ala Asn Ala Pro 
7 2 5 

Phe Val Val Trp Leu Gly Thr Ala 
740 

Ala Lys Ala .eu Ala Glu Ser :ie 

A.a Pro Ala Pro Ala Pro Ala Glu 

"° ^ 7-5 
G-y Gli: Val Ala Pro Thr Pro Th 1 - 

Pro Ala 



Pro Pro Pro Val Ala Asn 
63 5 

Gin Lys Leu Tyr Ala Ser 

550 

Arg Leu Gly Ser Asp Mec 
665 

Arg He Asn Gin Glu Thr 
685 

Ser Ala Ser Tyr Tyr Glu 
700 

Gly Gin He Trp Thr Gly 
715 

Asp Ala Gly Pro Pro Gin 

Asn Asn Pro Val Asp Lys 
74 5 750 
Arg Pro Leu Val Ala Pro 

Pro Ala Pro Ala Pro Ala 
^30 

Thr Pro Thr Pro Gin Arg 

795 



Asp Thr 
640 
Ala Glu 
655 

Gly Glu 

Val Ser 

Val Lys 

Val lie 
720 
Arg Trp 

735 

Gly Ala 



Pro Ala 

Thr Leu 

8 30 



information POP sec io :;c : ^ 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 34 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other



rCCAAAC CACCGAGCGG TTC 



(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
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,i SEQUENCE CHARACTERISTICS 
(A) LENGTH: 1962 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 349:

CATATGGGCC ATCATCATCA TCATCACGGA TCCAAACCAC CGAGCGGTTC CCCTGAAACG 60

GGCGCCGGCG CCGGTACTGT CGCGACTACC CCCGCGTCGT CGCCGGTGAC CTTGGCGGAG 120

ACCGGTAGCA CGCTGCTGTA CCCGCTGTTC AACCTGTGGG GTCCGGCCTT TCACGAGAGG 180

TATCCGAACG TCACGATCAC 33CTCAGGGC ACCGGTTCTG GTGCGGGGAT CCCCCAGGCC 240

GCCGCCGGGA 33GTCAACAT GGGGGCCTCC GACGCCTATC TGTCGGAAGG TGATATGGCC 300

GCGCACAAGG GGC7GATGAA 3ATCGC3CTA GCCATCTCCG CTCAGCAGGT CAACTACAAC 360

7TGCCC3GAG ""GA3CGAGGA 33TCAA3CT3 AACGGAAAAG TCCTGGCGGC CATGTAGGAG 420

7CCGGCACC3 CGGTAGTTCC 33TGCA27GC TGCGACGGGT 3CGGTGACAC 3TT3TTGTT7 54 0 

ACCGTGGACT TCZZGGZC^Z G7CGGGT3CG C7GGGTGA3A ACGGCAACGG CGGCATGGTG S60 

GCCAGTCAAC GGGGACTCGG CGAGG7C3AA CTAGGCAATA GCTCTGGCAA TTTGTTGTTG 78 0 

CC3GACGC3 3 AAAG C AT T CA 3GCC333GC3 3CT3GGTT3G 3ATCGAAAAG CCCGGCGAAC 34C 

TAC3CCATCG TCAACAAC CG 37AAAAGGAC GCCG3CACCG CGCAGACCTT GCAGGCATTT 96 0 

^ AT ~ A f~ 3A CGGCAACAAG 3GCT3GTT3C TC 3AGCAGGT TGATTTCGAG 1020 

GCGCTGC 3 2 23GCGG.3GT GAAGTT3TCT GAG 3C3TT 3 A TCGCGACGAT TTCCTCGGGA 109 0 

3GTGGCA37G GGGGAGGCTC AGGTGGAGGT -™3GC:G3A GC 3TGCCCAC AACGGCGGCG 1140 

OOGGIGGGGG CCAACACGCC 3AAT3CCCAG ~7GGGC7ATC CCAACGCAGC A3GT7C3CCG 12 6 0 

.■a CCGACAA OC 3 3G7TG3A3G ATTCAG7TTC 3CG7TG 7773 "TGGCTGGGT 7GAGTGT3AC 0 3 6 l 

AAGC77TTACC CCAGC3CC3A AG G GAG 3 GAG 7CCAAGGCCG 7G3GCCCGGTT GGGCT^GGAC ;S6C 

\TGGGTGAGT TCTATATG 2C 2TACCCGGG3 A7C7GGAT7A A G G A G G AA A G GGTCTCGG"" : ^ 2 7 

:ACC2GAAGC 3GGT3T3T33 .^3C3CGTC3 7ATTACGAAG T'CAAGTTCAG CGATCCGAGT ^37 

:r AGAT::T3 3 2;: G r GTA A ™ G ~™ ™~™aa ggcag-gag - 4 r 

L'wj',;^.;-...., .; .. .-.aj^vj,^ ... . . _ ..;-vA"""7G •. ~" *7G 7 0 7 ' : ~"""7~ C ~ r" " V"*""' ; "t ~ " 
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Men Gly His His His His His His Gly Ser Lys Pro Pro Ser Giv 3er 

1 5 10 15

Pro Glu Thr Gly Ala Gly Ala Gly Thr Val Ala Thr Thr Pro Ala Ser

20 25 30

Ser Pro Val Thr Leu Ala Glu Thr Gly Ser Thr Leu Leu Tyr Pro Leu

35 40 45

Phe Asn Leu Trp Gly Pro Ala Phe His Glu Arg Tyr Pro Asn Val Thr

50 55 60

Ile Thr Ala Gln Gly Thr Gly Ser Gly Ala Gly Ile Ala Gln Ala Ala
65 70 75 80

Ala Gly Thr Val Asn Ile Gly Ala Ser Asp Ala Tyr Leu Ser Glu Gly

85 90 95

Asp Met Ala Ala His Lys Gly Leu Met Asn Ile Ala Leu Ala Ile Ser

100 105 110

Ala Gln Gln Val Asn Tyr Asn Leu Pro Gly Val Ser Glu His Leu Lys

115 120 125

Leu Asn Gly Lys Val Leu Ala Ala Met Tyr Gln Gly Thr Ile Lys Thr

130 135 140

Trp Asp Asp Pro Gln Ile Ala Ala Leu Asn Pro Gly Val Asn Leu

145 150 155 160

Gly Thr Ala Val Val Pro Leu His Arg Ser Asp Gly Ser Gly Asp Thr

165 170 175

Phe Leu Phe Thr Gln Tyr Leu Ser Lys Gln Asp Pro Glu Gly Trp Gly

180 185 190

Lys Ser Pro Gly Phe Gly Thr Thr Val Asp Phe Pro Ala Val Pro Gly

195 200 205

Ala Leu Gly Glu Asn Gly Asn Gly Gly Met Val Thr Gly Cys Ala Glu

210 215 220

Thr Pro Gly Cys Val Ala Tyr Ile Gly Ile Ser Phe Leu Asp Gln Ala

230 235 240

Ser Gln Arg Gly Leu Gly Glu Ala Gln Leu Gly Asn Ser Ser Gly Asn

"•• • • a:, ,: , a;, a., - , he 

- ° - Zi~ - 

ser Lys vhr Pro M a Asn Gin Aia ;:- s-r ^ :; e Asr si- - 3 

:3C :3S 

" la 3r = " 3p J: >' - r ^ ^n Tyr 1 ] Tyr Al, :> .a. ;sr. 

:9C =3- 100 



^r-j .llr. .'/i Asr 



• •: : 

T^r T.e ser Jlv Glv ;v . Jer G.v 3;-. 11, 

360 ' ' 3- 

1J ->' v ^ Tr.r Thr Ala Ala Ser Pre ?:■_■ 

— 33C 
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ASE Md Glrl Pr ° Val ^ IIe *sr. Pro va; Gly Glv Phe Ser 

440 445 
Phe Ala Leu Pro Ala Gly Trp Val Glu Ser Asp Ala Ala ai a Phe Asp 



Tyr Glv Ser 



4S5 



460 



465 LeU Leu Ser ^ s Thr Gly Asp Pro Pro Phe Pro 

470 47 => 480 



Gly Gin Fro Pro Pro Val Ala Asn Asp Thr Arg lie Val Leu Gly "° 

485 490 495 

Leu Asp Gln Lys Leu Tyr Ala Ser Ala Glu Ala Thr Asp Ser Lys Ala



505 510 
Ala Ala Arg Leu Gly Ser Asp Met Gly Glu Phe Tyr Met Pro Tyr Pro

- ^ I - 520 525 

^7 .hr Arg .. e Asn Gin Glu Thr Val Ser Leu A sn .Ala Asn Gly Va ' 

. !?° , " 5 540 

^ aer Ala 3er ^ ^ r -u Val Ly 3 Phe Ser Aso Pro Ser Lvs 

550 



?rs * l ? ^ ^ ^ ^ -1/ va: ::e cS ser Pro L" 

- □ . , ^ 570 575 
.-a .ro „sd A. a Gly Pro Pro Gin Ara Tn, Phe 7a I Va ' : eu 

580 5H= ~ ' 



rhr Ala Asn Asn Pro Val 



Asp Lys Giv Al 



t\+a Lys Aid Lei: Ala Glu 

6 3 0 r ^ 

3er lie Arc Pro 



555 630 



610 



Arg Pro Leu Val Ala Pro Pro Pro Ala Pro All Pro Ala Pro 



615 



la Jlu Pro 



620 



■ ?r ° Ala ?r:5 Ala Ala Glv Glu Val Ala Pro Thr 

Pro Thr Thr ?— -u. - j5 *-*0 



"0 . r.r Pro Gin Arg Thr Leu Pr 

d 5 C 



o Ala 

54 5 
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